Section 1
Common Features and Options


The menu items and commands listed under “Global features” on page 2 can be issued
as part of any graphical or statistical procedure.
“Functions and Operators” on page 52 describes functions and operators that can
be used in LET, IF...THEN LET, and SELECT statements. These commands can be
specified as part of any procedure.
Global features 


Here is how we group the commands in this section:


General usage
Restructuring files
Editing data files
Format, layout, and titles
Help
Case selection and transformations
Displaying, printing, and saving text and graphs
Grouping variables and BY groups
The calculator


HOT indicates commands that are executed immediately. All other commands are 
cold—they are executed when the next HOT command is encountered. 
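The HOT/cold distinction can be pictured as a queue of pending commands that a HOT command flushes. The Python sketch below is only an illustration: the `run_script` helper and the `hot` set are hypothetical names, and it assumes that queued cold commands run in the order entered, immediately before the HOT command that triggers them. SYSTAT's internal mechanism is not described in this section.

```python
def run_script(commands, hot):
    """Return (executed, pending) for a list of command strings.

    hot: set of command names treated as HOT.
    Assumption: pending cold commands run in entry order just before
    the HOT command that triggers them.
    """
    executed, pending = [], []
    for cmd in commands:
        name = cmd.split()[0].upper()
        if name in hot:
            executed.extend(pending)   # flush the waiting cold commands
            pending.clear()
            executed.append(cmd)       # then run the HOT command itself
        else:
            pending.append(cmd)        # cold: wait for the next HOT command
    return executed, pending


executed, pending = run_script(
    ["USE MYFILE", "IDVAR NAME$", "FREQ COUNT", "LIST AGE", "WEIGHT W"],
    hot={"USE", "LIST"},
)
```

In this run, IDVAR and FREQ take effect only when LIST is reached, and WEIGHT is still pending at the end because no further HOT command follows it.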


General usage 


USE command 


(HOT) USE filename 


Reads a data file named filename.SYZ. You do not have to enclose the filename and path 
in quotation marks unless the path or name contains spaces. In the absence of a 
designated path, SYSTAT searches for the file in the directories defined by FPATH for 
input data files (USE), temporary data files (WORK), and output data files (SAVE). The 
date and time of a file’s creation appear in the output when that file is used. 


/ NAMES               suppresses the date and time information, displaying only the
                      names of the variables in the data file.
  NONAMES             neither the variable names nor the file’s date and time are
                      displayed.
  COMMENT             displays the file comments after the variable names.
  DICTIONARY          displays the file comments, variable names, and variable
                      comments.
  MATRIX=matrixname   reads the file as a matrix with the specified name.
  MTYPE=NUMERIC       reads all numeric variable(s) as a matrix. This is the default.
        STRING        reads all string variable(s) as a matrix.
  ROWNAME=var$        uses var$ to name the rows of the matrix.
  COLNAME=var$        uses var$ to name the columns of the matrix.


NEW command 


(HOT) NEW 


Creates an empty data file. NEW is equivalent to USE with no file name. 


GET command 


GET filename 


Reads the ASCII (plain text) file filename.DAT (use with INPUT). 


INPUT command (free format) 


INPUT varlist 


Names the variables in the order in which they will be read into SYSTAT from a text
file or typed from the keyboard. You can identify a range of variables in varlist using
subscript notation. Variable names can be up to 256 characters long, and the names of
string variables must be followed by a $.


\     for free-format input, place a backslash after varlist to force SYSTAT to start a
      new case for each line of data and to use every value entered in each row, even
      if it must start filling new cases to do so.


INPUT command (fixed format) 


INPUT (varlist) (format) 


For fixed-field input, INPUT has two arguments, each enclosed in parentheses. varlist 
indicates the variable names, in order. format is a format description in special notation 
described below: 


#n    read a numeric value in the next n columns
$n    read a character value in the next n columns
>     move the pointer one column to the right
<     move the pointer one column to the left
^n    move the pointer to column n
/     move the pointer to the first column on the next record
%n    move the pointer to the first column on the nth record
\     leave the pointer on the current record for the next case
nr    repeat r n times, where r is any of the above
Example: 


INPUT (SEX$ AGE WEIGHT) ($1 #3 #4) 


The first column, which contains character values, is read into SEX$. The next three 
columns are read into AGE. The next four columns are read into WEIGHT. 
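To make the column arithmetic concrete, the following Python sketch walks a pointer across one record using a subset of the format notation (#n, $n, >, <, and ^n). It is an illustration, not SYSTAT's parser: `read_fixed` is a hypothetical helper, the sample record is invented, multi-record codes and repeat counts are left out, and blank (missing) numeric fields are not handled.

```python
import re


def read_fixed(record, names, fmt):
    """Parse one record with a subset of the format codes: #n, $n, >, <, ^n."""
    values, pos = {}, 0
    name_iter = iter(names)
    for code in fmt.split():
        m = re.fullmatch(r"([#$^])(\d+)", code)
        if m:
            kind, n = m.group(1), int(m.group(2))
            if kind == "^":
                pos = n - 1                  # column numbers are 1-based
                continue
            field = record[pos:pos + n]      # take the next n columns
            pos += n
            values[next(name_iter)] = float(field) if kind == "#" else field
        elif code == ">":
            pos += 1                         # one column to the right
        elif code == "<":
            pos -= 1                         # one column to the left
        else:
            raise ValueError(f"unsupported format code: {code}")
    return values


# Invented record: column 1 is SEX$, columns 2-4 are AGE, columns 5-8 are WEIGHT.
row = read_fixed("M 34 165", ["SEX$", "AGE", "WEIGHT"], "$1 #3 #4")
```

Here `row` maps SEX$ to the single character in column 1, AGE to the number found in columns 2 to 4, and WEIGHT to the number found in columns 5 to 8.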


TYPE command 


TYPE argument 


Specifies the type of matrix you are entering. Use DIAGONAL ABSENT if the diagonal 
values are missing. 


Argument        Description
RECTANGULAR     cases-by-variables raw data matrix
SSCP            sum of squares cross-products matrix
COVARIANCE      covariance matrix
CORRELATION     correlation matrix
DISSIMILARITY   dissimilarity matrix
SIMILARITY      similarity matrix


DIAGONAL command


DIAGONAL PRESENT
         ABSENT


Specifies whether the matrix you are entering has values in the diagonal cells. The 
diagonal is assumed to be present unless you state DIAGONAL ABSENT. 


IMPORT command 


(HOT) IMPORT filename 


Imports the data file filename. You do not have to enclose the filename and path in
quotation marks unless the path or name contains spaces. To save a file, issue an
ESAVE command after IMPORT. If you do not save it, SYSTAT creates a file with the same
name as the one you are importing, but with the .SYZ extension. For ODBC, omit
filename; the connect string identifies the file to access.


S-PLUS files can only be imported via commands. Logical values in a DIF file use the
values 0 and 1 to represent false and true, and are stored as 0 or 1.


/ TYPE=filetype              specifies the format of the file being imported. Valid
                             filetypes: SIGMAPLOT, ASCII, SPSS, SAS, SASTRANSPORT,
                             MINITAB, STATA, STATISTICA, JMP, PLS (for S-PLUS files),
                             STATVIEW, LOTUS, LOTUS2, EXCEL, DBASE, DIF, ARCVIEW,
                             and ODBC.
  CONNECT='connect string'   for ODBC, lists the data source and the database to
                             access. CONNECT must be specified for ODBC import.
  TABLE='tablename'          for ODBC, lists the table to access. TABLE must be
                             specified for ODBC import.
  VARIABLES='varlist'        lists variables to import using ODBC. The variable list
                             must be enclosed in quotes. Omitting this option results
                             in all variables being imported.
  SQL='statement'            defines an SQL statement for ODBC import.
  SHEET='sheet number'       specifies the sheet number for EXCEL.


Example: 
IMPORT MYFILE.XLS / TYPE = EXCEL 


ESAVE command 


(HOT) ESAVE filename 


Saves the active data file to a SYSTAT file (filename.SYZ). ESAVE is equivalent to
the DSAVE command.


/ TYPE=RECTANGULAR     defines the structure of the saved file. RECTANGULAR
       SSCP            corresponds to standard cases-by-variables files. The other
       COVARIANCE      types denote matrix files.
       CORRELATION
       DISSIMILARITY
       SIMILARITY
  'text'               adds the specified text as a comment for the saved file. View
                       comments for a file using the COMMENT or DICTIONARY
                       options of the USE and NAMES commands.
DSAVE command


(HOT) DSAVE filename
It is an alias of ESAVE.


EWORK command 


(HOT) EWORK filename 


Saves the active data file to a temporary file (filename.SYZ). All files created using 
EWORK are deleted at the end of the session. 


/ TYPE=RECTANGULAR     defines the structure of the saved file. RECTANGULAR
       SSCP            corresponds to standard cases-by-variables files. The other
       COVARIANCE      types denote matrix files.
       CORRELATION
       DISSIMILARITY
       SIMILARITY
  DIR                  in the absence of a specified filename, EWORK / DIR lists the
                       files in the current WORK directory defined by FPATH. If
                       you specify a filename, the DIR option is ignored.


DWORK command


(HOT) DWORK filename
It is an alias of EWORK.


PUT command 


(HOT) PUT filename 


Saves data from the current SYSTAT file in a plain text (ASCII) file named
filename.DAT.


EXPORT command 


(HOT) EXPORT output filename 


Translates the current SYSTAT data file to the destination file filename in the format 
specified with TYPE. You do not have to enclose the filename and path in quotation 
marks unless the path or name contains spaces. S-PLUS files can only be exported via 
commands. STATISTICA does not support string type, so character variables cannot 
be exported. 


/ TYPE=filetype     specifies the destination format. Valid format names: SAS, SPSS,
                    MINITAB, STATA, STATISTICA, JMP, PLS (for S-PLUS files),
                    SASTRANSPORT, ASCII, LOTUS, LOTUS2, EXCEL, DBASE, and DIF.


Examples: 

EXPORT MYFILE.TPT / TYPE = SASTRANSPORT 
EXPORT MYFILE.SAV /TYPE=SPSS 

EXPORT MYFILE.MTW /TYPE=MTW 

EXPORT MYFILE.STA /TYPE=STA 

EXPORT MYFILE.JMP /TYPE=JMP 
LIST command


(HOT) LIST varlist


Lists the values of the variables in varlist (in the order you select). The FORMAT option 
controls the layout. When an IDVAR is selected, it is the first entry for each case. 


/ FORMAT=m,n or        specifies how numeric and string variables are printed
  'picture format'     (including blanks, field width, and the number of decimal
                       places following the decimal). Use m,n to specify the format
                       for each numeric variable: m is the number of character
                       spaces in each field; n is the number of digits following
                       the decimal point (0 <= n <= 14). Default is 12 for m, 3 for
                       n. You can specify n alone.

                       Picture format includes one symbol for each character, digit,
                       or symbol that you want to print (blanks are spaces in the
                       output). Use a pound sign (#) to indicate each digit of a
                       number and a period (.) to indicate the decimal point, if
                       needed. To right-justify, use (<); to left-justify, use (>); or to
                       print string variables without right- or left-justification, use
                       ($). For dates and times, use Y (years), M (months), D
                       (days), h (hours), m (minutes), and s (seconds).
  N=n                  lists the first n cases in the file. When used with BY
                       GROUPS, the first n cases in each group are listed.
  LABEL                inserts labels in the data listing instead of values from the
                       data file.


Examples:


LIST MURDER ROBBERY STATES
Lists the indicated variables in the default format.


LIST WEIGHT FIRST$ MIDL$ LAST$ / FORMAT = ' ###.## <<<<< >>>>> $$$$$ '
Lists data as: ' 165.25 JOE B SMITH '

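The effect of FORMAT=m,n on a numeric value can be approximated with an ordinary fixed-width format. The Python sketch below is an analogy only: `format_value` is a hypothetical helper, and it assumes that numbers are right-aligned within the m-character field, which this section does not state explicitly.

```python
def format_value(x, m=12, n=3):
    """Render x in a field of m characters with n digits after the decimal point.

    Defaults mirror the documented defaults (m = 12, n = 3).
    Assumption: the value is right-aligned within the field.
    """
    if not 0 <= n <= 14:
        raise ValueError("n must satisfy 0 <= n <= 14")
    return f"{x:{m}.{n}f}"


default = format_value(165.25)          # width 12, 3 decimals
narrow = format_value(3.14159, 8, 2)    # width 8, 2 decimals
```

With the defaults, 165.25 occupies twelve characters and shows three decimal places; the second call shows how a smaller m and n tighten the field.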
PRINT command


(HOT) PRINT varlist
            'string'


Displays the values of the variable(s) listed in varlist, or displays the character string you
specify. varlist can include numeric or string variables. Variable names and case numbers are
not printed. A comma between the variables in the PRINT command is now mandatory.




NOTE command 


NOTE n, 'line1' 'line2'... 


Prints any note (character string) in the text output (or to the printer or file if output is 
redirected) using center-justification. NOTE can print ASCII characters by their index; 
specify the index number without quotation marks. For example, NOTE 13 puts a 
carriage return in the output. You can specify both an index and a character string in a 
single line. 


LNOTE command 


LNOTE n, 'line1' 'line2'...

Adds the specified string(s) to the text output (or to the printer or file if output is 
redirected) using left-justification. LNOTE can print ASCII characters by their index; 
specify the index number without quotation marks. You can specify both an index and 
a character string in a single line. 


SUBMIT command 


(HOT) SUBMIT filename 


Executes commands in filename.SYC. Filename must be a text file with one command 
per line, and it should have a .SYC extension. 


REM command 


REM comment 


Inserts comments in a command file. All text that follows REM on the same line is
ignored. You can also use the !! symbol to comment a line: if you insert !! in a
command line, the text after the !! symbol is treated as a comment.
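The two comment styles can be pictured with a small filter that keeps only the executable part of each line. This Python sketch is an illustration under stated assumptions: `strip_comments` is a hypothetical helper, REM is matched without regard to case, and a !! that appears inside a quoted string is not treated specially (how SYSTAT handles that case is not covered here).

```python
def strip_comments(lines):
    """Drop REM lines and cut everything from '!!' to the end of each line."""
    kept = []
    for line in lines:
        code = line.split("!!", 1)[0].rstrip()   # text before the first !!
        if not code or code.split()[0].upper() == "REM":
            continue                             # nothing left to execute
        kept.append(code)
    return kept


script = [
    "REM load the survey data",
    "USE SURVEY",
    "LIST AGE !! show ages only",
    "!! a full-line comment",
]
commands = strip_comments(script)
```

Only the USE and LIST commands survive; the REM line and both !! comments are discarded.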


TOKEN command 


TOKEN &token 
TOKEN &token=value 


Assigns a replacement value to a token in a command file. Tokens serve as substitution
markers: the item must be replaced for the command to be successfully processed. The 
ampersand (&) character indicates that the immediately following text is a token. 
Tokens represent variable names, numbers, text strings, or filenames. When SYSTAT 
encounters a token that has been assigned a replacement value, all subsequent 
instances of that token are replaced by that value. If the token has not been assigned, 
processing pauses and a dialog prompts the user to specify the replacement value. 
Assigning a value allows processing to continue. Use TOKEN without an argument to 
reset all assigned tokens. 

To prevent SYSTAT from interpreting text following an ampersand as a token, use 
two consecutive ampersands in the command file. For example, to title a graph 
“Northwest&Southeast”, use the following: 

(plot command) / TITLE='Northwest&&Southeast'


/ ON                toggles token processing on and off. Setting processing to OFF
  OFF               results in no token substitution. The last processed TOKEN
                    command defines the current state of token processing.
  TYPE=             defines the type of dialog used to assign a value to a token.
                    Command processing halts until the user supplies a value
                    consistent with the specified type. SYSTAT uses a generic token
                    substitution dialog if no TYPE is specified. To restrict the
                    type of acceptable information for a token, select one of the
                    following types:
     MESSAGE        provides the user with information about the template.
     OPEN           allows the user to select a file to be opened.
     SAVE           allows the user to specify a name for saving files.
     STRING         allows the user to input a string.
     NUMBER         allows the user to input a number.
     INTEGER        allows the user to input an integer.
     VARIABLE       allows the user to select a numeric or string variable from the
                    current file.
     CVARIABLE      allows the user to select a string variable from the current file.
     NVARIABLE      allows the user to select a numeric variable from the current file.
     MULTIVAR       allows the user to select multiple variables from the current file.
                    The user can select any combination of numeric and string
                    variables. If desired, use the PROMPT option to add instructions
                    regarding the number of variables to select.
     NMULTIVAR      allows the user to select multiple numeric variables from the
                    current file. If desired, use the PROMPT option to add
                    instructions regarding the number of variables to select.
     CMULTIVAR      allows the user to select multiple string variables from the
                    current file. If desired, use the PROMPT option to add
                    instructions regarding the number of variables to select.
  SEPARATOR=char    for a token representing multiple variables, specifies a character
                    appearing between the variable names. The separator character
                    does not appear before the first variable or after the last variable.
                    By default, SYSTAT uses a space as the separator character.
  PROMPT='text'     defines the prompting text appearing in the replacement dialog.
                    The text string can contain up to 100 characters.
  LIST              lists all tokens with their assigned values.
  IMMEDIATE         requests that the prompting dialog appear when SYSTAT
                    processes the TOKEN statement.


Examples:


TOKEN &XVAR = POP_1999


Replaces all occurrences of &XVAR with “POP_1999” during command processing.


TOKEN &SYMBOL / TYPE=INTEGER, PROMPT='Enter a value between 1 and 22
(inclusive) to specify the plot symbol.'


When SYSTAT encounters &SYMBOL during processing, a dialog prompts the user for 
a plot symbol value. All references to &SYMBOL are replaced by the designated value. 
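The substitution rules for tokens, including the double-ampersand escape, can be sketched in a few lines of Python. This is an illustration rather than SYSTAT's implementation: `substitute` is a hypothetical helper, token names are assumed to consist of letters, digits, and underscores, and an unassigned token raises an error here where SYSTAT would instead pause and prompt with a dialog.

```python
import re


def substitute(line, tokens):
    """Replace &NAME with tokens['NAME']; '&&' yields a literal ampersand."""
    def repl(match):
        if match.group(0) == "&&":
            return "&"                       # escaped ampersand
        name = match.group(1)
        if name not in tokens:
            # SYSTAT would prompt for a value; this sketch simply fails.
            raise KeyError(f"token &{name} has no assigned value")
        return tokens[name]

    # Assumed token-name syntax: a letter or underscore, then letters/digits/underscores.
    return re.sub(r"&&|&([A-Za-z_][A-Za-z0-9_]*)", repl, line)


tokens = {"XVAR": "POP_1999"}                # as set by TOKEN &XVAR = POP_1999
listed = substitute("LIST &XVAR", tokens)
title = substitute("TITLE='Northwest&&Southeast'", tokens)
```

The first call replaces &XVAR with its assigned value, while the second leaves a single literal ampersand in the title text.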


LINK command 


LINK file2.SYZ file1.SYC 


Executes commands in file1.SYC whenever file2.SYZ is read using USE. 


DIALOG command 


DIALOG name 


Displays the dialog box corresponding to name. The software suspends processing of 
any pending commands until you close the dialog box. If you click the OK button, 
command processing continues at the line following DIALOG. The Cancel button aborts 
command processing. Any dialog box that requires data cannot be accessed until you 
open a data file. 

Some analyses in GLM, ANOVA, VC, MIXED, RSM, and LOGIT require preliminary 
results before they can be applied. For example, hypothesis testing for general linear 
models can only be performed after estimating a model. In these cases, the dialog box 
cannot be displayed until the completion of all necessary prerequisites. 

Valid values for name follow: 


Data and variable dialog boxes: 


Name          Functionality
APPEND        Append files end-to-end
BY            Stratify the analyses
CATEGORY      Identify categorical variables
CENTER        Centers the variable(s)
EXTRACT       Extract selected cases to a file
FREQ          Define a frequency variable
IDVAR         Define an identification variable
IF            Conditionally transform variables
LABEL         Define value labels
LET           Create or transform variables
LIST          List data for selected variables
MERGE         Merge files side-by-side
ORDER         Identify a sorting order for variable categories
RANK          Replace data with ranks
RECODE        Recode values of variables
RESHAPE       Change data layout
SELECT        Select cases
SORT          Reorder cases
STACK         Arrange variable(s) in single column
STAND         Standardize variables
TRANSPOSE     Transpose variables
TRIM          Delete extreme observations
WEIGHT        Identify a variable for case weighting


Utilities dialog boxes: 


Name          Functionality
DECOMPOSE     Perform eigenvalues, Cholesky, QR or SV decomposition of a matrix
DOE           Classic design of experiments
DOEWIZ        Design of experiments wizard
GENERATE      Generate matrix of different types
OPERATION     Matrix operations
ORGANIZE      Save, show and clear matrices and directory of matrices
PCALC         Probability calculator for continuous distributions
PCALD         Probability calculator for discrete distributions
POWCORR1      Power analysis for single correlation
POWCORR2      Power analysis for equality of two correlation coefficients
POWGENERIC    Generic power analysis
POWONEWAY     Power analysis for one-way ANOVA
POWPROP1      Power analysis for single proportion
POWPROP2      Power analysis for equality of two proportions
POWT1         Power analysis for one sample t-test
POWT2         Power analysis for two sample t-test
POWTPAIRED    Power analysis for paired t-test
POWTWOWAY     Power analysis for two-way ANOVA
POWZ1         Power analysis for one sample Z test
POWZ2         Power analysis for two sample Z test
RANDSAMC      Univariate continuous random sampling
RANDSAMD      Univariate discrete random sampling
READ          Reads matrices from file or keyboard
ROWCOLUMN     Row/column operations of the last matrix


Analyze dialog boxes: 


Name          Functionality
ACF           Autocorrelation plots
ADTEST        Anderson-Darling test
ADDTREE       Additive trees
ADJSEASON     Seasonal adjustment of time series
ANOVA         Analysis of variance
ANOVAHYPO     Hypothesis tests in analysis of variance (after successfully estimating a
              model)
ANOVAPOST     Posthoc analysis in analysis of variance (after successfully estimating a
              model)
ARIMA         ARIMA models
ARL           Average run length curves
BAYESIAN      Bayesian regression
BETWEEN       Between group testing for MANOVA
CCF           Cross-correlation plots
CLSTEM        Stem and leaf plot for column
CONJOINT      Conjoint analysis
CORAN         Correspondence analysis
CORR          Correlation and distance measures
CRONBACH      Cronbach's alpha
CSTAT         Column statistics
CUSUM         Cumulative sum charts
DECILES       Deciles of risk in logistic regression (after successfully estimating a
              logistic model)
DISCRIM       Discriminant analysis
EWMA          Weighted moving average charts
EXPONENTIAL   Exponential smoothing of time series
FACTOR        Factor analysis
FITC          Fitting distribution: continuous
FITD          Fitting distribution: discrete
FRIEDMAN      Friedman tests
GLM           General linear models
GLMHYPO       Hypothesis tests in general linear models (after successfully estimating a
              model)
GLMPOST       Post hoc estimate for repeated measures in GLM (after estimating a model)
HLMM          Hierarchical linear mixed models
JOIN          Hierarchical clustering
KCLUSTER      K-means and K-medians clustering
KRUSKAL       Kruskal-Wallis tests
KS1           One sample Kolmogorov-Smirnov tests
KS2           Two sample Kolmogorov-Smirnov tests
LADREG        LAD regression
LMM           Linear mixed models
LMSREG        LMS regression
LOGIT         Logistic regression
LOGITHYPO     Hypothesis tests in logistic regression (after successfully estimating a
              logistic model)
LOGLIN        Loglinear models
LOWESS        LOWESS smoothing of time series
LTSREG        LTS regression
MA            Unweighted moving average charts
MANOVA        Estimate model for MANOVA
MANOVAHYPO    Hypothesis test in MANOVA (after estimating a model)
MANOVAPOST    Post hoc estimate for repeated measures in MANOVA (after estimating a
              model)
MDS           Multidimensional scaling
MISSING       Missing value analysis
MIXHIER       Mixed regression for data having a hierarchical structure
MIXEDHYPO     Hypothesis tests in mixed models
MIXMULTI      Mixed regression for data having a multivariate structure
MNTEST        Multinormal tests
MREG          M regression
MULTIWAY      Multiway tables
NLLOSS        Nonlinear loss functions
NLMODEL       Nonlinear models
OCC           Operating characteristic curves
ONEWAY        One-way tables
PACF          Partial autocorrelation plots
PAIRWISE      Pairwise mean comparisons (after successfully estimating a linear model)
PARETO        Pareto charts
PCA           Process capability analysis
PERMAP        Perceptual mapping
PLS           Partial least-squares regression
POISSON       Hypothesis Testing: Poisson test
POSAC         Partially ordered scalogram analysis
PROBIT        Probit models
QCREGRESS     Regression charts
QHIST         Histogram in Quality analysis
QUADE         Quade tests
QUANTILES     Quantiles in logistic regression (after successfully estimating a logistic
              model)
RAMONA        Path analysis (structural equation models)
RDISCRIM      Robust discriminant analysis
REGRESS       Linear regression
RIDGEREG      Ridge regression
RRANK         Rank regression
RSTAT         Row statistics
RSMEST        Response surface models
RSMOPT        Optimize estimated model
RSMPLOT       Surface or contour plot
RUNCHART      Run chart
RUNS          Wald-Wolfowitz runs tests
RWSTEM        Stem and leaf plot for row
SCORAN        Smart correspondence analysis
SETCOR        Set and canonical correlations
SHEWHART      Shewhart control charts
SIGN          Sign tests
SIGNAL        Signal detection analysis
SIMULATE      Simulation in logistic regression (after successfully estimating a logistic
              model)
SMOOTH        Nonparametric smoothing
SPATIAL       Spatial statistics
SREG          S regression
STD           Standardize tables
SURVNPAR      Survival analysis: Nonparametric
SURVPARC      Survival analysis: Parametric
TABULATE      Crosstabulation tables
TESTATCL      Classical test item analysis
TESTATLOG     Logistic item response analysis
TIME          Defines labels for time series plots
TOHC0         Hypothesis Testing: Zero correlation
TOHC1         Hypothesis Testing: Specific correlation
TOHC2         Hypothesis Testing: Equality of two correlation coefficients
TOHP1         Hypothesis Testing: Single proportion
TOHP2         Hypothesis Testing: Equality of two proportions
TOHT1         Hypothesis Testing: One sample t-test
TOHT2         Hypothesis Testing: Two sample t-test
TOHTPAIRED    Hypothesis Testing: Paired t-test
TOHV1         Hypothesis Testing: Single variance
TOHV2         Hypothesis Testing: Two variances
TOHVN         Hypothesis Testing: Several variances
TOHZ1         Hypothesis Testing: One sample z-test
TOHZ2         Hypothesis Testing: Two sample z-test
TPLOT         Time series plots
TRANSFORM     Time series transformations
TREES         Classification and regression trees
TRENDAN       Trend analysis
TSFOURIER     Fourier decomposition of time series
TSLS          Two-stage least-squares models
TSQ           Hotelling's T² chart
TSSMOOTH      Smoothing time series
TWOWAY        Two-way tables
VC            Variance component models
WHISKER       Box-and-whisker plots
WILCOXON      Wilcoxon tests
WITHIN        Within group testing for MANOVA (after successfully estimating a model)
XMR           X-MR chart

Graph dialog boxes:


Name          Functionality
BAR           Bar charts
BOX           Boxplots
DENFUNC       Density functions
DOT           Dot charts
DOTDENS       Dot density displays
FOURIER       Andrews Fourier plots
FPLOT         Function plots
GALLERY       Graph Gallery
HILO          High-low-close charts
HIST          Histograms
ICON          Icon plots
LINE          Line charts
MAP           Maps
PARALLEL      Parallel coordinate displays
PIE           Pie charts
PLOT          Scatterplots
PPLOT         Probability plots
PROFILE       Profile charts
PYRAMID       Pyramid charts
QPLOT         Quantile plots
SPLOM         Scatterplot matrices


Example:
DIALOG PLOT

FREQUENCY command 
FREQ var 


To save space in a data file, cases with the same values for each variable are entered as 
a single case with a count stored in var. The sample size for analyses is the sum of the 
values of var. The values of var are truncated to integers before processing. Issuing 
FREQ without an argument clears the current selection. When file TYPE is not 
RECTANGULAR, FREQ has no effect. 
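The relationship between a frequency variable and the effective sample size can be checked with a short calculation. The Python sketch below is illustrative only: the helper names and the sample data are invented, and the treatment of zero or negative counts is not addressed because this section does not specify it.

```python
import math


def effective_sample_size(counts):
    """Sum of the frequency values after truncating each one to an integer."""
    return sum(math.trunc(c) for c in counts)


def expand(cases, counts):
    """Repeat each case according to its truncated frequency."""
    return [case for case, c in zip(cases, counts) for _ in range(math.trunc(c))]


cases = [("M", 1), ("F", 1), ("F", 2)]   # invented condensed data
counts = [3, 2.9, 1]                     # 2.9 is truncated to 2
n = effective_sample_size(counts)
rows = expand(cases, counts)
```

Three condensed cases stand for six observations here, because the count 2.9 is truncated to 2 before it is used.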
WEIGHT command


WEIGHT var 


Weights each case with the value of var, where var contains a numeric value 
representing the degree of importance (or weight) the case should have when 
performing analyses. Typically, the weight is proportional to 1 divided by the variance, 
when such information is known. Issuing WEIGHT without an argument clears the 
current weight selection. When file TYPE is not RECTANGULAR, WEIGHT has no 
effect. 


CATEGORY command 


CATEGORY varlist 
CATEGORY varlist$ 
CATEGORY 


The CATEGORY command treats values of a numeric or character variable as 
categories. Thus, if you identify a quantitative variable as categorical, each unique 
value appears on the plot scale, the values are equally spaced (e.g., if the only numbers 
are 1, 3, and 15, they are spaced as 1, 2, 3), and the minimum and maximum data values 
are limits for the plot scale. 
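The equal-spacing behaviour can be illustrated by mapping each distinct value to its rank among the categories. In this Python sketch, `category_positions` is a hypothetical helper, and ordering the categories by ascending value is an assumption made for the illustration.

```python
def category_positions(values):
    """Map each unique value to an equally spaced position 1, 2, 3, ..."""
    levels = sorted(set(values))                      # assumed ascending order
    return {level: i for i, level in enumerate(levels, start=1)}


positions = category_positions([15, 1, 3, 3, 1])
```

The values 1, 3, and 15 land at positions 1, 2, and 3, so the gap between 3 and 15 is drawn no wider than the gap between 1 and 3.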


The following options are available: 


/ MISS        allows cases with a missing value for the categorical variable to
              be included as an additional category.
  NOMISS      does not include cases with missing values while defining
              categories. This is the default.
  ADD         adds the list of variables to the existing list. This is the default.
  REPLACE     replaces all existing category variables with varlist or varlist$.
  OFF         removes varlist variables from the existing category list.
  ?           lists in the output all variables defined as categorical.
VARLABEL command


VARLABEL var / 'text'
VARLABEL var$ / 'text'


Assigns the label in text to the variable var or var$.


LABEL command 


LABEL var / n1='text1', n2='text2',...
LABEL var$ / 'oldtext1'='newtext1'


Use to:


■ label values

■ assign a character name to each code for use as a label in the output; cases with
  unspecified codes are omitted from subsequent graphical displays or analyses that
  use the variable

■ assign new labels for string variables


You can assign a label for missing code values. Use the following options:


/ n1='text1', n2='text2',...    the labels for codes n1, n2, ... in the data file become
                                text1, text2, ..., respectively.

  n1,n2,...='text1',            the labels for cases with codes n1, n2, ... become text1, the
  n3,n4,...='text2',...         labels for cases with codes n3, n4, ... become text2, etc.
                                (The codes do not have to be consecutive values or
                                specified in numerical order.)

  n1..n2='text1',               the labels for the codes n1, n2, and all codes between them
  n3..n4='text2',...            become text1; the labels for codes n3, n4, and all codes
                                between them become text2, etc.

  ..n1='text1',                 for some variables, this is a shortcut notation for the
  ..n2='text2',                 previous syntax. The labels for the code n1 and all codes
  ..n3='text3',...              less than n1 become text1; the labels for code n2 and all
                                codes between n1 and n2 become text2, etc.

  'oldtext1'='newtext1',        assigns new names to string codes.
  'oldtext2'='newtext2',...

  var or 'var$'                 turns off the previous LABEL specification.
Example:
LABEL DRUG / 1='HEAVY USE', 2='FREQUENT USE', 3='NEVER USED'
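The effect of this example on later output can be mimicked with a lookup table. The Python sketch below is an illustration: `apply_labels` and the sample codes are invented, and it models only the documented rule that cases whose codes have no label are omitted from displays and analyses that use the variable.

```python
labels = {1: "HEAVY USE", 2: "FREQUENT USE", 3: "NEVER USED"}


def apply_labels(codes, labels):
    """Replace codes with labels; cases whose code has no label are dropped."""
    return [labels[c] for c in codes if c in labels]


# Invented DRUG codes; 4 has no label and is therefore omitted.
shown = apply_labels([1, 3, 4, 2, 1], labels)
```

Five cases go in and four come out, because code 4 was not given a label.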


IDVAR command 


IDVAR var 
IDVAR var$ 


Replaces case numbers with values of var or var$ in both the data editor and in listings
of casewise results. For the latter, use IDVAR to insert a case label before the other
values for each case.


VIEW command 


(HOT) VIEW filename 


Displays a data file named filename in noneditable mode.


NAMES command 


(HOT) NAMES 


Displays variable names for the current file. 


/ COMMENT                  displays the file comments after the variable names.
  DICTIONARY               displays the file comments, variable names, variable comments,
                           and value labels.
  GENLAB                   displays the label definition (given in the LABEL command) for
                           all variable(s).
  GENLAB="filename.syc"    writes the label definitions for all variable(s) to the
                           specified file.
OPTIONS command


(HOT) OPTIONS 


Displays options currently in effect. 


HELP command 


(HOT) HELP procedure


To access HELP on a procedure, type HELP followed by the procedure name. If you are
already in the procedure that you want HELP on, just type HELP, or HELP followed by
the name of a command.


CALCULATE command 


(HOT) CALC expression 


Invokes SYSTAT’s internal calculator, which returns results of equations constructed using
any of the SYSTAT functions and operators. The calculator uses the numbers you enter in
the equation. It does not know variable names and does not read their data values.


Examples: 
CALC NOW$('MMM dd, hhhh:mm')
CALC EXP(1.826) 


RSEED command 


RSEED n 


Specifies the random seed. For the Wichmann-Hill generator, n should be between 1 and
30000. If n is not specified, SYSTAT uses its own seed. For Mersenne-Twister, n should be
between 1 and 4294967295; otherwise the seed is based on the system time.
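The purpose of fixing a seed is reproducibility: the same seed yields the same stream of random numbers. The Python sketch below demonstrates the idea with Python's own `random` module, which also uses a Mersenne Twister generator. It is an analogy only; the numbers it produces should not be expected to match SYSTAT's, and `draws` is a hypothetical helper.

```python
import random


def draws(seed, k=3):
    """Return k uniform draws from a generator seeded with `seed`."""
    rng = random.Random(seed)    # Python's generator is a Mersenne Twister
    return [rng.random() for _ in range(k)]


same = draws(12345) == draws(12345)        # identical seeds, identical streams
different = draws(12345) != draws(54321)   # different seeds, different streams
```

Recording the seed alongside an analysis is what allows a simulation or random sample to be repeated exactly.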
RNDGEN command


RNDGEN MT
       WH


Specifies the random number generator to be used, Mersenne-Twister (MT) or
Wichmann-Hill (WH). The default is MT.


PUSH command 


PUSH envcommand 


Saves the current state or value for envcommand in a 'last-in-first-off' stack.
Environment command states that can be saved to the stack include:


Command        State
CLASSIC        ON or OFF
ECHO           ON or OFF
FORMAT         current numerical output format
FPATH          all path prefixes automatically appended to file names
GRAPH          whether or not Quick Graphs are on
PAGE           WIDE or NARROW
PLENGTH        NONE, SHORT, MEDIUM, or LONG
RSEED          current seed for random number generation
MONOCHROME     ON or OFF


Use PUSH ALL to store the states for all of the above commands in the stack.


Restore the values saved in the stack using the POP command. The 'last-in-first-off'
nature of the stack requires that restoration occur in the reverse order of how the states
were saved.


Example: 
PUSH FPATH 

Saves all current paths defined by the FPATH command to a command stack for later 
restoration. 
POP command


POP envcommand 


Removes the last saved value from a stack of command states created by PUSH and 
sets envcommand to that value. Commands that can have their states set by POP include 
CLASSIC, ECHO, FORMAT, FPATH, GRAPH, PAGE, PLENGTH, RSEED, and 
MONOCHROME. Use POP ALL to reset all of these commands to their corresponding
stack entries, if any. 


Example: 


POP PLENGTH 
Sets the print length to the value currently on the top of the stack created by PUSH. 
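The interplay of PUSH and POP can be modelled as a single stack of saved settings. The Python sketch below is a toy model under stated assumptions: the `EnvStack` class is hypothetical, one shared stack is assumed for all environment commands, and an out-of-order POP raises an error here, whereas SYSTAT's actual response to that situation is not described in this section.

```python
class EnvStack:
    """Toy model of PUSH/POP for environment settings."""

    def __init__(self, settings):
        self.settings = dict(settings)
        self.stack = []                  # the last entry pushed is popped first

    def push(self, name):
        """Save the current value of one setting."""
        self.stack.append((name, self.settings[name]))

    def pop(self, name):
        """Restore the most recently saved value, which must belong to `name`."""
        saved_name, value = self.stack[-1]
        if saved_name != name:
            raise ValueError(f"top of stack holds {saved_name}, not {name}")
        self.stack.pop()
        self.settings[name] = value


env = EnvStack({"ECHO": "OFF", "PLENGTH": "SHORT"})
env.push("ECHO")
env.push("PLENGTH")
env.settings["PLENGTH"] = "LONG"   # temporary changes for part of a session
env.settings["ECHO"] = "ON"
env.pop("PLENGTH")                 # restore in reverse order of saving
env.pop("ECHO")
```

After the two POP calls, both settings are back to the values they had when they were pushed, which is the typical use of the pair inside a command file that should leave the user's environment unchanged.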


FPATH command 


FPATH prefix 


Lets you define an automatic path prefix to append to file names. Use it to read a file 
from or to direct output to a specific directory or device. You do not have to enclose 
the path in quotation marks unless the path contains spaces. 

By default, SYSTAT uses the specified prefix for all subsequent file manipulations, 
regardless of the type of file being accessed or saved. However, there are fourteen 
types of files for which you can specify prefixes individually. You can specify the same 
prefix for multiple file types; that is, you can store many different file types in the same 
directory. 

Using FPATH with no path prefix sets all file paths to the SYSTAT program folder.
To set multiple paths to the SYSTAT folder without resetting all paths, use the 
SYSTAT program folder as the path prefix. 


/ PROJECT    sets prefix as the root directory under which sub-folders Gallery,
             Data, Command, and Output will be created.
  GALLERY    Graph Gallery files.
  USE        SYSTAT input data (.SYD, .SYZ and .SYS) files.
  SAVE       SYSTAT saved data (.SYD, .SYZ and .SYS) files.
  WORK       temporary SYSTAT data (.SYD, .SYZ and .SYS) files.
  IMPORT     all imported data files.
  EXPORT     all exported data.
  SUBMIT     SYSTAT command (.SYC) files.
  OSAVE      SYSTAT output (.SYO and .MHT) files.
  OUTPUT     ASCII output (.DAT) files for OUTPUT commands.
  GSAVE      all saved graph files.
  GET        ASCII input (.DAT) files for GET commands.
  PUT        ASCII output (.DAT) files for PUT commands.
  HTML       HTML output files.
  RTF        rich-text format output files.

Examples: 

FPATH D: / SAVE
Saves all SYSTAT data files to device D:.

FPATH \MYDATA\ / USE GET SAVE
Places all SYSTAT input (.DAT, .SYD, and .SYS) and saved data (.SYD) files in MYDATA.

FPATH C:\USR\SYSTAT\ / SUBMIT
Reads and saves all .SYC files using the \usr\systat\ directory on drive C:.

FPATH 'C:\PROGRAM FILES\SYSTAT12\' / OSAVE
Directs .SYO and HTML (.MHT) output files to the SYSTAT program folder.

FPATH 'C:\MY PROJECT' / PROJECT
Creates sub-folders Gallery, Data, Command, and Output within C:\My Project, and sets
the file paths to these for the respective file types.


EXIT command


EXIT 


Exits from any module in which you are working. 


QUIT command 


QUIT 


Closes SYSTAT. 


KILL command 


KILL filename 
KILLS 


Deletes the specified file from the user's system. Include the path to the file if the file 
is not stored in the SYSTAT folder. Use the wildcard character (*) to delete multiple 
files. Deletion of each file must be confirmed if KILL is used interactively. Running 
KILL from a command file results in the designated file(s) being deleted without 
confirmation. 


Examples: 


KILL MYFILE.DAT 
Deletes MYFILE.DAT from the SYSTAT folder. 


KILL 'C:\SYSTAT\PROJECT 1\*.BMP'
Deletes all bitmaps in the specified folder. 


WINDOW command 


WINDOW nnnn 


Sets the first year of a 100-year range covering all two-digit years. To assign a century,
SYSTAT uses the year from the range that, when truncated, corresponds to the two-digit
year. For example, a 100-year range beginning at 1950 ends at 2049. SYSTAT
treats a two-digit year of 34 as 2034 (1934 falls outside the century range) and a
two-digit year of 99 as 1999 (2099 falls outside the range).
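
The century rule above amounts to picking the one year in the 100-year window whose last two digits match. The Python sketch below illustrates that arithmetic. The function name resolve_two_digit_year is hypothetical and the code is not part of SYSTAT.

```python
def resolve_two_digit_year(yy, window_start):
    """Return the year in [window_start, window_start + 99] that ends in yy."""
    if not 0 <= yy <= 99:
        raise ValueError("yy must be a two-digit year (0-99)")
    century = window_start - window_start % 100  # e.g. 1950 -> 1900
    year = century + yy
    if year < window_start:
        # The candidate falls before the window, so use the next century.
        year += 100
    return year


# WINDOW 1950 covers 1950 through 2049.
print(resolve_two_digit_year(34, 1950))
print(resolve_two_digit_year(99, 1950))
```

With a window starting at 1950, the two calls reproduce the examples in the text: 34 resolves to 2034 and 99 resolves to 1999.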


Restructuring files 


SORT command 


(HOT) SORT varlist 


Sorts the cases in your file using values of up to 10 numeric or string variables (varlist).
Missing values are placed at the beginning. String variables are sorted in ASCII order. 
If you specify more than one variable, SYSTAT executes a nested sort. For a complete 
listing of the ASCII sort order, see SYSTAT: Data. 


/ D or A,          sorts in ascending or descending order. Specify an A or a D for
  or a list of     each variable in varlist. A is the default.
  D's and A's


MERGE command 


(HOT) MERGE file1 (varlist1) file2 (varlist2) 


Merges cases in file1 side by side with those in file2. In the absence of a designated path, 
the software searches for the files in the directories defined by FPATH for input data 
files (USE), temporary data files (WORK), and output data files (SAVE). 

Use the optional varlist if you want to merge selected variables. By default, SYSTAT
joins the first record in file1 with the first record in file2, the second record in file1 with
the second record in file2, etc. If a variable VAR exists in both files, the merged file
contains two variables, VAR_FILE1 and VAR_FILE2. As an option, you can use the
value of a key variable to link records.


/ keyvars    SYSTAT joins cases that have the same values for the key variable(s).
             If no match is found, the new record is filled with missing
             value flags. When multiple occurrences of a key value are found,
             SYSTAT replicates the values from the other file.


APPEND command 


(HOT) APPEND file1 file2 


Merges files end to end. In the absence of a designated path, SYSTAT searches for the
files in the directories defined by FPATH for input data files (USE), temporary data files
(WORK), and output data files (SAVE).

Cases in file2 follow those in file1 and the variables should be in the same order. Issue
a DSAVE or an ESAVE command (for the resulting file) after specifying APPEND to
save the file.


/ INTERSECTION    includes variables common to both file1 and file2 in the new file. The
                  resulting file will not contain variables appearing in only one of the
                  files being appended.

  MATCH           includes all variables appearing in both file1 and file2 in the new file.
                  The two files must contain the same variables in the same columns.
                  If not, SYSTAT issues an error.

  UNION           includes all variables appearing in file1 and file2 in the new file.
                  Variables appearing in only one of the files being appended receive
                  missing values for all appended cases from the other file.


Example:


Assuming file1 has variables A, B, and D, and file2 has variables A, B, and C:

- INTERSECTION includes variables A and B in the new file.

- MATCH yields an error message because file1 and file2 do not contain the same
  variables in the same order.

- UNION includes variables A, B, C, and D (values of C are missing value codes for
  file1, and values of D are missing value codes for file2).
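
The three APPEND options differ only in which variables survive in the combined file. The Python sketch below models that rule on small in-memory tables. It is an illustration rather than SYSTAT code: the function append_files, the use of None as the missing value, and the column order produced by UNION are assumptions.

```python
MISSING = None  # stand-in for SYSTAT's missing value code (assumption)


def append_files(file1, file2, mode="UNION"):
    """Each file is (variable_names, rows), where rows is a list of dicts."""
    names1, rows1 = file1
    names2, rows2 = file2
    if mode == "INTERSECTION":
        names = [v for v in names1 if v in names2]
    elif mode == "MATCH":
        if list(names1) != list(names2):
            raise ValueError("files do not contain the same variables in the same order")
        names = list(names1)
    elif mode == "UNION":
        # Column order here (file1's variables, then new ones) is an assumption.
        names = list(names1) + [v for v in names2 if v not in names1]
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Cases from file2 follow those from file1; absent variables become MISSING.
    rows = [{v: row.get(v, MISSING) for v in names} for row in rows1 + rows2]
    return names, rows


file1 = (["A", "B", "D"], [{"A": 1, "B": 2, "D": 3}])
file2 = (["A", "B", "C"], [{"A": 4, "B": 5, "C": 6}])

print(append_files(file1, file2, "INTERSECTION"))
print(append_files(file1, file2, "UNION"))
```

Calling append_files with mode "MATCH" on these two tables raises an error, as in the example above.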


EXTRACT command


(HOT) EXTRACT filename 


Executes transformations and saves only those cases from the current SELECT
statement in a SYSTAT data file, filename.SYZ. You do not have to enclose the filename
and path in quotation marks unless the path or name contains spaces.


/ VARIABLES = varlist    extracts selected case(s) of variable(s) specified in varlist.


TRIM command 


(HOT) TRIM varlist


Trims cases for each numeric variable named in varlist. The default varlist is all
numeric variables in the file. Use ESAVE to store results to a file.


/ TRPROP=p    specifies the proportion of the data to be trimmed (0<p<1 for lower or
              upper and 0<p<0.5 for two-sided trimming). The default is 0.10.


TREGION=TWOSIDED trims extreme observations from both sides. This is the default option. 
UPPER trims only upper extreme observations. 
LOWER trims only lower extreme observations. 


METHOD=SEPARATE removes the specified proportion of extreme observations separately 
for each of the selected variables. This is the default trimming method. 


LISTWISE excludes the specified proportion of extreme observations separately 
for each of the selected variables at the first step, then completely 
excludes all those cases, which have observations for at least one of 
the selected variable(s) excluded. 


OFF turns off case trimming so that all cases are used in subsequent 
analysis. 


WRAP command


(HOT) WRAP varlist 


Wraps your data in order to easily change a multivariate repeated measures layout to a 
split-plot data layout. Strings each case over multiple records according to number of 
variables in varlist. The following table illustrates three variables and two cases 
wrapped using all three variables. 


1 2 3    becomes    1
4 5 6               2
                    3
                    4
                    5
                    6


UNWRAP command 
(HOT) UNWRAP n 


Unwraps your data in order to easily change a split-plot data layout to a multivariate 
repeated measures layout. Packs each block of n cases on a single record. The 
following table illustrates one variable with six cases, unwrapped with three entered as 
the value. 


1    becomes    1 2 3
2               4 5 6
3
4
5
6
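
WRAP and UNWRAP are inverse reshaping operations, which the two tables above show on a small data set. The Python sketch below reproduces those tables with nested lists. The function names wrap and unwrap are hypothetical, and the sketch models only the values of the selected variables, not any other variables in the file.

```python
def wrap(rows):
    """String each case (row) out over one record per value."""
    return [value for row in rows for value in row]


def unwrap(values, n):
    """Pack each block of n consecutive cases onto a single record."""
    if len(values) % n != 0:
        # Assumption: the number of cases is a multiple of n.
        raise ValueError("number of cases must be a multiple of n")
    return [values[i:i + n] for i in range(0, len(values), n)]


wide = [[1, 2, 3], [4, 5, 6]]   # two cases, three variables
long = wrap(wide)               # six cases, one variable
print(long)
print(unwrap(long, 3))
```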


TRANSPOSE command


(HOT) TRANSPOSE varlist 


Transposes a data file by turning rows (cases) into columns (variables) and vice versa.
varlist is optional; the default is all numeric variables in the file.


STACK command 


(HOT) STACK varlist 


Arranges the observations from the specified variables into a single column.


Editing data files


REPEAT command 


REPEAT n 


Adds n cases to the end of the active data file. 


INSERT command 


INSERT m,n 


Inserts n (>=1) blank rows or columns beginning at the mth row or column.


/ ROWS              inserts n cases beginning at the mth case.
  COLUMNS           inserts n variables beginning at the mth variable.
  NAMES=namelist    gives names in namelist to inserted variable(s).


DEFVAR command


(HOT) DEFVAR varlist


Defines a variable as numeric (including dates, times, and currency) or string. Variable
names may contain up to 256 letters or numbers, but must begin with a letter. All
variable names that do not include the dollar sign are defined as numeric. Along with
letters and numbers, the underscore character (_) may be used to indicate a space, but
no other characters or spaces may be used within a variable name. However,
subscripted variables allow a left parenthesis followed by a positive integer followed
by the right parenthesis.

You can change a variable from being numeric to string and vice versa. However,
if you change a variable from string to numeric, SYSTAT converts all non-numeric
data to missing values. Cases with numeric data will keep their numeric information.
If a case has a combination of numeric and string data, the entry is converted to a
missing value.


/ TYPE = STRING          defines all variables in varlist to be string variables.
                         The names of string variables must end with a dollar
                         sign ($) and this symbol counts as one of the 256
                         characters of the name. Values of string variables must
                         be 256 characters or fewer.
         NUMBER          defines all variables in varlist to be numeric.
         DATE            defines all variables in varlist to be dates or times.
         EXPONENTIAL     prints numeric values in exponential notation.

  DISPLAY = m.n          specifies the format for numeric variables. Specify the
                         number of characters (m) and the number of decimal
                         places (n).
            'dateformat' specifies the format for dates or times. Designate how
                         many characters to display in the date by repeating the
                         following characters. For dates, use d for days, M for
                         months, and y for years. For times, use s for seconds,
                         m for minutes, and h for hours.

                         For example, typing MM for the month yields two
                         characters in the representation of months (i.e.,
                         September would look like 09). By typing MMM, the
                         month is abbreviated using three letters (September
                         would be Sep). Specifying four or more characters
                         yields the entire name. Day and day of the week
                         displays also follow this format: two d's display the
                         day in digits, three d's display the day of the week in
                         the abbreviated form (Tues, etc.), and four d's display
                         the day of the week using the complete name
                         (Tuesday).

                         Valid date and time formats are:

                         MM/dd/yyyy
                         dd-MMM-yyyy
                         dd.MM.yyyy
                         yyyy ddd
                         MMM yyyy
                         hh:mm
                         HH:mm
                         hh:mm:ss
                         HH:mm:ss
                         ddd hh:mm
                         ddd hh:mm:ss
                         Monday, Tuesday, ...
                         Mon., Tue., Wed., ...
                         January, February, ...
                         Jan., Feb., Mar., ...
                         dd-MMM-yyyy hh:mm:ss
            n            specifies the format for string variables. Specify the
                         number of characters (n <= 256).

  COMMENT = "text"       adds comments to the variables in varlist. The
                         specified text is displayed when opening a data file or
                         via NAMES / COMMENT.


Examples:


DEFVAR START / TYPE=DATE DISPLAY='MM/DD/YYYY'
DEFVAR SIZE / TYPE=EXPONENTIAL DISPLAY=12.3
DEFVAR ANSWERS / TYPE=STRING DISPLAY=10 COMMENT='TESTVAR'


DIM command


(HOT) DIM var(n) 


Reserves space for a new variable var with subscript n. 


DROP command 


(HOT) DROP varlist 


Prevents the variable(s) given by varlist from being written to the saved file. 


DELETE COLUMNS command 


(HOT) DELETE COLUMNS = var1, var2, var3 .. var4
Removes the specified variable(s) from the active data file to the clipboard.


DELETE ROWS command 


(HOT) DELETE ROWS = n1, n2, n3 .. n4
Removes the specified case(s) from the active data file to the clipboard.


CUT command 


(HOT) CUT varlist 


Removes the variables in varlist from the active data file to the clipboard.


PASTE command 


(HOT) PASTE nloc 


Pastes the contents of the clipboard to the location specified by nloc. If the clipboard
contains variables (via CUT), nloc defines the column number at which the
variable(s) are to be inserted. If the clipboard contains cases (via DELETE), nloc defines the
row number at which to paste the data. The first case is row 1; the first variable is
column 1.


Transforming data 


LET command


LET var = expression 


Assigns the value of expression to the variable var. You can use either numeric or string 
variables. String values must be enclosed in apostrophes or quotation marks. 


CENTER command 


(HOT) CENTER varlist 

Centers each numeric variable named in varlist around the mean for that variable. The
default is all numeric variables in the file. Use ESAVE to store results in a file. To center
values around group means, issue BY before CENTER.


STAND command 


(HOT) STAND varlist


Standardizes each numeric variable named in varlist. The default is all numeric
variables in the file. Use ESAVE to store the standardized results to a file.


/ SD       replaces values of each variable with its sample standard score (z score).

  RANGE    for each variable, subtracts the smallest data value from each value and
           divides by its range to form a 0,1 scale.
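
The two STAND options correspond to two familiar rescalings. The Python sketch below shows both on a short list of values. The function names stand_sd and stand_range are hypothetical, and the use of the n-1 (sample) standard deviation for the z score is an assumption based on the phrase "sample standard score".

```python
from statistics import mean, stdev


def stand_sd(values):
    """z scores: subtract the mean, divide by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # n-1 denominator (assumed to match the SD option)
    return [(x - m) / s for x in values]


def stand_range(values):
    """Rescale to 0..1: subtract the minimum, divide by the range."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]


data = [2, 4, 6]
print(stand_sd(data))
print(stand_range(data))
```

Both functions fail with a division by zero when every value is identical. How SYSTAT handles a constant variable is not stated here.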


RANK command


(HOT) RANK varlist

Transforms all numeric variable(s) in varlist to ranks. Each variable is ranked within its
own distribution, and the rank replaces the data value. The default is all numeric
variable(s) in the file. Use ESAVE to store results in a file.


RECODE command 


(HOT) RECODE argument / specification 


Recodes variable(s) mentioned in argument according to specification; argument can
be one of the following:


Argument                      Result
varlist or varlist$           Overwrites the values of variables in varlist or varlist$
                              according to the coding specified in specification.

varlist=var, varlist$=var$,   Creates variables in varlist or varlist$ by applying the specified
varlist=var$, or              coding to var or var$. The variable var or var$ used for coding
varlist$=var                  is not modified. Variables in varlist or varlist$ that are already
                              present will be overwritten.


In general, specification consists of a list of assignments separated by commas. The
values on the left hand side of the assignment operator are replaced by the value on the
right hand side. For recoding a numeric variable, i.e., for the arguments varlist or
varlist=var, use any of the following:


n1=m1, n2=m2, ...                     the value n1 is replaced by m1, the value n2 is
                                      replaced by m2, ...

n1, n2, ...=m1, n3, n4, ...=m2, ...   values n1, n2, ... are replaced by m1, values
                                      n3, n4, ... are replaced by m2, ...

n1..n2=m1, n3..n4=m2, ...             values between n1 and n2 are replaced by m1,
                                      values between n3 and n4 are replaced by
                                      m2, ... In other words, this allows you to
                                      discretize a continuous variable, i.e., define
                                      intervals along a continuous variable.

..n1=m1, ..n2=m2, ...                 values less than or equal to n1 are replaced by
                                      m1, values greater than n1 but less than or
                                      equal to n2 are replaced by m2, ... Here
                                      n1, n2, ... must be in increasing order.

For recoding a string variable, i.e., for the arguments varlist$ or varlist$=var$, use any
of the following:


'oldtext1'='newtext1', 'oldtext2'='newtext2', ...
        the string oldtext1 is replaced by newtext1, the string oldtext2 is
        replaced by newtext2, ...

'oldtext1', 'oldtext2', ...='newtext1', 'oldtext3', 'oldtext4', ...='newtext2', ...
        values oldtext1, oldtext2, ... are replaced by newtext1, values
        oldtext3, oldtext4, ... are replaced by newtext2, ...


You can use an appropriate combination of the above specifications for recoding 
numeric variables into string, or vice versa. You can also use missing values in the 
specifications, i.e., the dot (.) for numeric missing values, and ' ' for string missing 
values. 


Examples:

RECODE QUE(1 .. 10) / 1=0, 2=1, . = 99

RECODE GENDER = SEX$ / 'FEMALE'=0, 'MALE'=1

RECODE EDUCA$, EDUCB$=EDUCATN / 1,2='HS DROPOUT',
                                3='HS GRAD', 4,5='COLLEGE',
                                6,7='DEGREE +'

RECODE AGE$=AGE / ..29='18 TO 29',
                  30 .. 45 ='30 TO 45',
                  46 .. 60 ='46 TO 60', 60 .. ='OVER 60'
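
The open-ended form ..n1=m1, ..n2=m2 described earlier assigns each value to the first upper bound it does not exceed. The Python sketch below illustrates that lookup. The function recode_upper_bounds and its default argument are hypothetical, and what happens to values above the last bound or to missing values is an assumption, since the text does not say.

```python
from bisect import bisect_left


def recode_upper_bounds(value, bounds, codes, default=None):
    """Map value to codes[i], where bounds[i] is the first bound >= value.

    bounds must be in increasing order, as the specification requires.
    """
    if value is None:
        return default  # assumption: missing values get the default
    i = bisect_left(bounds, value)  # index of first bound >= value
    # Assumption: values above the last bound fall through to the default.
    return codes[i] if i < len(codes) else default


bounds = [29, 45, 60]
codes = ["18 TO 29", "30 TO 45", "46 TO 60"]
for age in (18, 29, 30, 60, 61):
    print(age, recode_upper_bounds(age, bounds, codes, default="OVER 60"))
```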


Case selection and grouping 


SELECT command 


SELECT exprn1 AND exprn2 OR exprn3...
       COMPLETE
       CASE = expression


Selects a subgroup of case(s) for analysis. Connect expressions with logical AND or OR
and use parentheses for clarity. Only cases meeting all of the expression conditions are
used. Use SELECT without an argument to end the selection condition. Specify SELECT
COMPLETE to include only those cases with no values missing. Use the built-in case
sequence variable CASE with expressions like SELECT CASE < 100.

Example:
SELECT AGE > 21 AND (CITY$ = 'Boston' OR CITY$ = 'Chicago')


BY command 


BY grpvar1, grpvar2,... 


Specifies grouping variables for subgroup analysis. Analyses are performed separately
for each combination of levels of the grouping variable(s). Cases need not be sorted by
the values of the grpvars. Issuing BY without an argument clears the current BY groups
selection. You can select up to 10 grouping variables.


/ MISS    when one or more values of a BY variable are missing, do not include a group for
          "missing."


Displaying, formatting and saving outputs and graphs 


OUTPUT command 


OUTPUT device 
filename 


Sends all subsequent statistical text results to the specified device (VIDEO or 
PRINTER), or to a file, as specified by filename. Use OUTPUT without an argument to 
stop sending output to the device. Use the following arguments: 


Device      Sends output to
VIDEO       screen. Can also use *. This is the default.
PRINTER     printer (sends the content of the Output Editor).
filename    file named filename in ASCII (text) format (and the screen).

You can prevent statistical output from appearing on screen using the following option.


/ NOSCREEN    suppresses output from appearing in the Output Editor. Use
              before statistical procedures to save resulting statistics to a data
              file for subsequent processing without displaying the initial output.
              Additionally, sending output directly to file without displaying
              it on screen can result in significantly faster processing for
              large jobs.


ECHO command 


ECHO ON 
OFF 


With ECHO mode ON, every submitted command appears in the Viewspace 
immediately before the resulting output (if any). Use this processing option to maintain 
a single output file consisting of commands, statistical output, and graphs. The last 
setting (ON or OFF) of this command remains in effect when SYSTAT is started for the 
next session. 


PAGE command 


PAGE NARROW 
WIDE 


Selects screen display and printer output format and characteristics with the options 
that follow. 


PLENGTH command


PLENGTH NONE
SHORT
MEDIUM
LONG


Prints defined sets of output for some statistical procedures (SHORT, MEDIUM, or
LONG groups). You can also use DISPLAY instead of PLENGTH.


/ argument    some procedures have specific features that can be requested as options for
              PLENGTH. This allows you to add to the set selected with the argument.


CLASSIC command 


CLASSIC ON 
OFF 


With CLASSIC mode ON, all subsequent output appears in a text-like format without
tables. Output generated with CLASSIC ON appears in the fixed
width font specified in the Edit: Options dialog (by default Courier New). With 
CLASSIC mode OFF, output appears in formatted tables in the designated proportional 
font. The last setting (ON or OFF) of this command remains in effect when SYSTAT is 
started for the next session. 


FORMAT command 


FORMAT m,n 


Specifies the format for each numeric variable: m is the number of character spaces in
each field for data listings and matrix layouts (0 <= m <= 23); n is the number of digits
following the decimal point (0 <= n <= 14). The default is 12 for m, 3 for n. You can
specify n alone. A number that would otherwise violate the specified field width will
be converted to exponential notation while maintaining the number of decimal places.


/ UNDERFLOW prints in exponential notation tiny numbers that would otherwise 
appear as 0. 


VDISPLAY command


VDISPLAY argument 


Displays label information of variable(s) for subsequent output. 


LABELS displays variable labels. This is the default. 
NAMES displays variable names. 


BOTH displays variable names as well as variable labels. The variable var is
shown as label(var).


LDISPLAY command 


LDISPLAY argument 


Displays label information of variable(s) for subsequent output. 


LABELS displays variable labels. This is the default.
DATA displays values. 
BOTH displays values as well as value labels. The variable var is shown as 


var) label. 


ORDER varname or varlist


Orders the values or value labels of varname or varlist. The following options are
available:


/ SORT =           defines the order of values or value labels.

    NONE           values or value labels are ordered as SYSTAT first encounters them in the
                   data file.

    ASC            numeric values or value labels are ordered from smallest to largest, and
                   string values or value labels are ordered alphabetically. This is the default.

    DESC           numeric values or value labels are ordered from largest to smallest, and
                   string values or value labels are ordered backward, alphabetically.

    FASC or        orders values or value labels by the frequency of cases within each.
    FDESC          Use FASC for an ascending sort and FDESC for a descending sort, which
                   places the category of greatest frequency first.

    n1,n2,... or   you specify the order for numeric or string values or value labels (for
    val1,val2      example, low, medium, high).

  MISS             when the value of var or var$ is missing, includes an additional category
                   for missing.

  DATA             indicates that SORT applies to data values. This is the default.

  LABEL            indicates that SORT applies to the labels specified with the LABEL
                   statement.


Example:


ORDER DRUG / SORT='NEVER USE', 'FREQUENT USE', 'HEAVY USE'


GRAPH command 


GRAPH NONE 


Controls the display of Quick Graphs (graphs that are automatically generated in
statistical output). Typing GRAPH NONE turns the Quick Graphs off. GRAPH without
an argument turns them back on.


GPRINT command


GPRINT 


Prints graphs from within your syntax jobs for production mode. SYSTAT maintains 
an internal index of graphs appearing in the output window; the newest graph assumes 
an index of 1 and the oldest graph assumes an index of n. Creation of a new graph resets 
this index to ensure that the most recently created graph receives a value of 1. 

GPRINT acts on the most recently created graph. We recommend issuing GPRINT 
immediately after the command that generates the graph to be printed. However, by 
issuing consecutive GPRINT commands, you can print several graphs. SYSTAT prints 
the most recent graph first, the graph created before the most recent graph second, and 
so on. Issuing any other command after a GPRINT command resets the internal index 
for the next GPRINT to the most recent graph. 


/ PORTRAIT     defines the orientation of the graph on the printed page. LANDSCAPE
  LANDSCAPE    results in a horizontal page (wider than it is high). PORTRAIT results
               in a vertical page (higher than it is wide). If unspecified, the
               orientation corresponds to the setting defined in the Page Setup dialog
               of the Graph window.

  ALL          sends all graphs in the output window to the printer. All printed graphs
               use the same page orientation.


Example: 


Suppose the output contains a bar chart and a line chart, with the latter appearing after 
the former. The following commands print four graphs: 


Command Sequence       Printed Output
GPRINT / PORTRAIT      line chart
GPRINT / PORTRAIT      bar chart
PLOT Y*X               (nothing)
GPRINT / LANDSCAPE     scatterplot
GPRINT / LANDSCAPE     line chart


Notice that by creating a new graph, we reset the index for the most recent graph. 
Consequently, we print the line chart twice, but in different orientations. 
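
The internal graph index can be pictured as a pointer that advances with each consecutive GPRINT and snaps back to the newest graph whenever anything else happens. The Python sketch below models that behavior for the sequence in the table above. The class GraphOutput and its methods are hypothetical, and returning None once every graph has been printed is an assumption, because the text does not describe that case.

```python
class GraphOutput:
    """Toy model of the graph index used by GPRINT (not SYSTAT code)."""

    def __init__(self, graphs=()):
        self.graphs = list(graphs)  # oldest first, newest last
        self._next = 1              # 1 = newest graph

    def add_graph(self, name):
        # Creating a graph resets the index so the new graph is number 1.
        self.graphs.append(name)
        self._next = 1

    def other_command(self):
        # Any command other than GPRINT resets the index as well.
        self._next = 1

    def gprint(self):
        if self._next > len(self.graphs):
            return None  # assumption: nothing left to print
        name = self.graphs[-self._next]
        self._next += 1
        return name


out = GraphOutput(["bar chart", "line chart"])
printed = [out.gprint(), out.gprint()]
out.add_graph("scatterplot")          # PLOT Y*X
printed += [out.gprint(), out.gprint()]
print(printed)
```

The same model applies to consecutive GSAVE commands, which the manual describes with identical index rules.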


OSAVE command


(HOT) OSAVE filename 


Saves all output generated before issuing the OSAVE command to the file filename. The 
default format is SYSTAT output (.SYO). 


/ SYO     saves the output in SYO format.
  RTF     saves the output in rich text format.
  HTML    saves the output in HTML format (hypertext markup language).
  MHT     saves the output in MHT format.


GSAVE command 


GSAVE filename 


Saves graphs in a specified format from within your syntax jobs for production mode. 
SYSTAT maintains an internal index of graphs appearing in the output window; the 
newest graph assumes an index of 1 and the oldest graph assumes an index of n. 
Creation of a new graph resets this index to ensure that the most recently created graph 
receives a value of 1. 

GSAVE acts on the most recently created graph. We recommend issuing GSAVE
immediately after the command that generates the graph to be saved. However, by
issuing consecutive GSAVE commands, you can save several graphs. SYSTAT saves
the most recent graph first, the graph created before the most recent graph second, and
so on. Issuing any other command after a GSAVE command resets the internal index
for the next GSAVE to the most recent graph.


/ format    specifies the format for the saved graph. Valid formats include WMF, EMF,
            EPS, PS, JPEG, PICT, BMP, CGM, TIFF, GIF, or PNG.
  ALL       saves all graphs in the output window in the designated format. To generate
            unique filenames, SYSTAT appends consecutive integers beginning with 1
            to the specified name. If SYSTAT encounters an existing graph having the
            same name as a graph being saved, the newer graph replaces the older
            graph.


Examples:

GSAVE MYGRAPH / CGM
Saves the most recent graph as a computer graphics metafile, mygraph.cgm.

GSAVE RESULT / WMF ALL
Saves graphs in the output as result1.wmf, result2.wmf, ...


Programming Structures


Built-in variables 


These are automatic (internal) variables that can be used in an IF...THEN statement.
They provide information about where you are in a file.


BOF beginning of file 

EOF end of file 

BOG beginning of group (when a BY command is in effect) 
EOG end of group (when a BY command is in effect) 
CASE current case (observation) number 


Temporary variables 


Temporary variables are created in memory for the duration of data-related
operations. You can assign either numeric or string values to temporary variables
without opening a data file. Like string data variables, temporary string variables
must end with a dollar sign ($). You can assign a temporary variable to a data
variable; the reverse is possible only by using the DATA or DATA$ functions. You can
clear these variables from memory using the CLEAR VARIABLES command. The
following is the syntax for creating temporary variables:


var = numeric value
var$ = 'string value'


Example:


NEW
REPEAT 1
X=5
Y=2
LET Z=X+Y
PRINT Z


CLEAR VARIABLES command 


CLEAR VARIABLES = name1, name2...


Clears the temporary variables name1, name2, ... from memory.


IF...THEN command


With the IF... THEN statement, you can execute actions conditionally. The syntax for an 
IF...THEN statement is: 


IF condition THEN expression 
IF condition THEN action 


The action that follows THEN can include LET, DELETE, PRINT, BEGINBLOCK and 
another IF... THEN. 


For example:


IF condition THEN LET variable = expression 
IF condition THEN PRINT variable and/or string 
IF condition THEN DELETE 
IF condition THEN IF condition THEN 
IF condition THEN BEGINBLOCK 

action 

ENDBLOCK 
ENDIF 


Note: To print more than one variable, the PRINT command must be given on a new
line.


FOR...NEXT command 


FOR index=n1 TO n2 STEP n3 
statements 
NEXT 


Starts a FOR...NEXT loop. index must be a temporary numeric variable name. You
must specify n1, but n2 is optional. You can optionally specify an increment value with
STEP n3. You can specify any real number or expression for n1, n2, or n3.


Example: 


FOR K=1 TO 12 
LET X(K) = LOG(X(K)) 
NEXT 


WHILE...ENDWHILE command


WHILE (condition) 
commandlist 
ENDWHILE 


Starts a WHILE...ENDWHILE loop. Only temporary variables may be used in
condition.

Example:

A=1
WHILE (A<=5)
PRINT A
A=A+1
ENDWHILE


ELSE command 


ELSE statement 


Follows an IF...THEN command. statement is executed when the IF condition 
evaluates as false. The statement can be any valid command, including another 
IF... THEN command. 


Example: 


IF AGE>25 THEN LET GROUP=2, 
ELSE LET GROUP=1 


ARRAY command 

ARRAY name / varlist 

ARRAY name(dim) 

name / varlist    aliases the variables in varlist as an array of subscripted variables. The
                  set of variables has the name name, and the variable names have integer
                  subscripts 1 through n, where n is the number of variables in varlist.

name(dim)         creates a temporary array with name name and dimension dim.


Use the CLEAR ARRAYS command to remove temporary arrays
from memory.


Examples:

ARRAY GRADE / MATH, VERBAL, ANALYTIC


ARRAY X(5)
FOR K=1 TO 5
X(K)= ZRN(0,1)
PRINT X(K)
NEXT
CLEAR ARRAYS = X


CLEAR ARRAYS command 


CLEAR ARRAYS = name1, name2...


Clears the arrays name1, name2, ... from memory.


FUNCTION...RETURN command 


FUNCTION name(arguments)
{
commandlist
RETURN expression
}


Creates a function named name with the arguments arguments. The name should be
different from existing SYSTAT functions. The defined function should return a
numeric value. Within a function, only temporary variables can be used.


Example: 


FUNCTION MYERN(N)
{
SUM = 0
FOR I=1 TO N
SUM = SUM + ERN(0,1)
NEXT
RETURN SUM
}

NEW
FOR I=1 TO 3
W = MYERN(I)
PRINT W
NEXT


Resampling 


Resampling is a powerful way to produce estimates of parameters in samples taken
from unknown probability distributions. SYSTAT provides three resampling
procedures: Bootstrap (BOOT), Simple random sampling without replacement
(SIMPLE), and Jackknife (JACK), as a single option to the ESTIMATE command or its
equivalent in each module. The computations are handled without producing a scratch
file of the bootstrapped samples, which saves disk space and computer time. Quick
Graphs are turned off for resampling. Resampling is currently available for the
following modules:


ANOVA CLUSTER* CONJOINT CORAN 
CORR* DISCRIM FACTOR LOGLIN 
GLM MANOVA MISSING NONLIN 
NPAR* POSAC REGRESS SETCOR 
SMOOTH SPATIAL* STATS* TESTAT 
TREES TESTING* XTAB* 


*Indicates modules that do not have an ESTIMATE command. SAMPLE should be an
option to whatever HOT command produces the statistical output.


/ SAMPLE=BOOT(m,n)     The argument m is the number of samples; the argument n is
         SIMPLE(m,n)   the size of each sample. The parameter n is optional and
         JACK          defaults to the number of cases in the file. BOOT generates
                       samples with replacement. SIMPLE generates samples without
                       replacement. JACK generates a jackknifed data set.


In the Basic Statistics, CORR, and REGRESS modules, SYSTAT provides a summary
based on resampling. For further details, see the respective pages.
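
The three sampling schemes can be sketched in Python as follows. This is a rough illustration, not SYSTAT's implementation: the function names boot, simple, and jack are hypothetical, and the leave-one-out form of the jackknife is an assumption, since the text above says only that JACK generates a jackknifed data set.

```python
import random

def boot(data, m, n=None, rng=random):
    """m samples of size n drawn WITH replacement (n defaults to the number of cases)."""
    n = len(data) if n is None else n
    return [[rng.choice(data) for _ in range(n)] for _ in range(m)]

def simple(data, m, n=None, rng=random):
    """m samples of size n drawn WITHOUT replacement."""
    n = len(data) if n is None else n
    return [rng.sample(data, n) for _ in range(m)]

def jack(data):
    """Leave-one-out samples (assumed form of the jackknife)."""
    return [data[:i] + data[i + 1:] for i in range(len(data))]

cases = [2.1, 3.4, 5.0, 7.7, 9.2]
random.seed(1)
boot_means = [sum(s) / len(s) for s in boot(cases, m=200)]
jack_means = [sum(s) / len(s) for s in jack(cases)]
print(len(boot_means), len(jack_means))
```

The spread of boot_means or jack_means is what a resampling summary would be built from.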
Functions and Operators


SYSTAT provides functions and operators that can be used in LET, IF...THEN LET, and
SELECT statements, such as:


LET var = expression
IF condition THEN LET action
SELECT condition


These statements can be specified as part of any procedure. 


Relational and logical operators 


Use these operators to compare two variables, functions, or constants. The result is true 
(1), false (0), or missing. 


IF AGE < 21 THEN DELETE
IF CAR$='BMW' AND INCOME > 40000 THEN LET STATUS$='trendy'


Operator   Result          Operator   Result
=          equal to        <=         less than or equal to
<          less than       >=         greater than or equal to
>          greater than    AND        boolean “and”
<>         not equal to    OR         boolean “or”
                           NOT        boolean “negative”


Built-in variables 


LET TIME = CASE 
LET GOODCASE = COMPLETE 


Variable Result 
CASE case number 
COMPLETE 1, if all data are present; 0, if any data are missing 
Functions for LET and IF...THEN LET statements


Functions modify a (a variable, number, or expression you place inside parentheses); 
for example, 


LET LG_WT = LOG(WEIGHT) 


SYSTAT sets the results of inadmissible operations (for example, dividing by 0) to
missing. If a is missing or does not meet the condition in brackets ([ ]), the result is
missing.

Function      Result

SQR (a)       square root of a [a > 0]
LOG (a)       natural logarithm of a [a > 0]
L10 (a)       logarithm base 10 [a > 0]
EXP (a)       exponential function e^a
LAG (var,n)   lag variable by shifting values down n rows. If n is omitted, the default is 1.
INT (a)       integer part of a
LGM (a)       log gamma: LGM(a) = LOG(Γ(a)) = LOG((a-1)!)
SGN (a)       -1 if a < 0, 0 if a = 0, and 1 if a > 0
ABS (a)       absolute value |a|
SIN (a)       sine of a (in radians)
COS (a)       cosine of a (in radians)
TAN (a)       tangent of a (in radians)
ASN (a)       arcsine of a (which yields radian results)
ACS (a)       arccosine of a (which yields radian results)
TNH (a)       hyperbolic tangent of a
ATN (a)       arctangent of a (which yields radian results)
ATH (a)       arc hyperbolic tangent of a (Fisher's z) (which yields radian results)
AT2 (a,b)     arctangent with sine (a) and cosine (b) arguments
MOD (a,b)     the remainder of a/b
Multivariable functions


LET TOTAL=SUM(QUIZ1,QUIZ2,FINAL, FINAL) 


Function                          Result

MIS(y1,y2,...)                    number of missing values
NUM(y1,y2,...)                    number of values that are not missing (number of
                                  usable values)
AVG(y1,y2,...)                    mean of nonmissing values
STD(y1,y2,...)                    standard deviation of nonmissing values
MIN(y1,y2,...)                    smallest among nonmissing values (minimum value)
MAX(y1,y2,...)                    largest among nonmissing values (maximum value)
SUM(y1,y2,...)                    sum of nonmissing values
SLE(y1,y2,...)                    coefficient b of the regression line y = a + bx where x's
                                  are equally spaced
SLU(x1,y1,x2,y2,...)              coefficient b of the regression line y = a + bx
ARE(y1,y2,...)                    area under the values of (xi,yi), where x's are assumed
                                  to be equally spaced
ARU(x1,y1,x2,y2,...)              area under y by the trapezoidal rule
COD(number,var1,var2,...)         the index (using integers 1, 2, 3, ...) when number
                                  matches a value in var1, var2, ..., respectively; 0
                                  otherwise
COD('charval',var1$,var2$,...)    the index (using integers 1, 2, 3, ...) of var1$, var2$, ...
                                  when charval matches the value of the respective
                                  variable; 0 otherwise
INC(number,var1,var2,...)         1 (true) if number matches a value in var1, var2, ...,
                                  or varn; 0 (false) otherwise
INC('charval',var1$,var2$,...)    1 (true) when charval matches a value in var1$,
                                  var2$, ..., or varn$; 0 (false) otherwise


■ Arguments are not restricted to variable names; they can be explicit values or
  other functions, and they can contain arithmetic operators (for example,
  (INDEX-1)).

■ Either commas or spaces can be used to separate arguments, unless an ambiguity
  arises. For example, when using AVG to compute the mean,
  AVG(10, 313-29, 236, 19) differs from AVG(10 313 -29 236 19). For the first
  result, four numbers are averaged (the second is 313-29). For the second result,
  five numbers are used (the third is -29).

■ Use a double period (..) as shortcut notation for a list of contiguous
  variables; that is, for Q1, Q2, ..., Q20, use Q1 .. Q20.
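
The difference between the two AVG calls is easy to verify by hand. The following Python lines are plain arithmetic on the numbers quoted above, not SYSTAT code, and the variable names are illustrative only.

```python
# Comma-separated form: 313-29 is a single argument, so four numbers are averaged.
comma_args = [10, 313 - 29, 236, 19]
# Space-separated form: -29 is read as its own argument, so five numbers are averaged.
space_args = [10, 313, -29, 236, 19]

avg_comma = sum(comma_args) / len(comma_args)
avg_space = sum(space_args) / len(space_args)
print(avg_comma, avg_space)
```

Both lists sum to 549, so the results differ only because the divisor changes from four to five.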


The functions described below operate on cases of a variable. With these functions you
can compute a value for specified cases.


Function                                          Result

CMIN(variable, initial case, end case, step)      Smallest among non-missing values
                                                  (minimum value).
CMAX(variable, initial case, end case, step)      Largest among non-missing values
                                                  (maximum value).
CSUM(variable, initial case, end case, step)      Sum of non-missing values.
CPROD(variable, initial case, end case, step)     Product of non-missing values.
CRANGE(variable, initial case, end case, step)    Range of non-missing values (maximum
                                                  value - minimum value).
CMEAN(variable, initial case, end case, step)     Mean of non-missing values.
CVAR(variable, initial case, end case, step)      Variance of non-missing values.
CSTD(variable, initial case, end case, step)      Standard deviation of non-missing values.

Initial case and step are '1' by default.
Default end case is the last case of the file in use.


The argument need not be just a variable name; it can be a mathematical function that
takes a variable as an argument, and it can contain arithmetic operators. Here is an
example that uses the following data file:

X    Y    Z
1    6    11
2    7    12
3    8    13
4    9    14
5    10   15

EXP(CSUM(Y*(X+Z)))

For each case, this multiplies Y by the sum of X and Z and then adds up the resulting
values. The exponential of that sum is 4.308817E+286.
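
The arithmetic behind this result can be reproduced with a few lines of Python. This is only a check of the numbers in the example, not SYSTAT code, and the list names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5]
y = [6, 7, 8, 9, 10]
z = [11, 12, 13, 14, 15]

# Y*(X+Z) evaluated case by case, then summed over all cases as CSUM does.
terms = [yi * (xi + zi) for xi, yi, zi in zip(x, y, z)]
csum = sum(terms)
result = math.exp(csum)
print(terms, csum, result)
```

The per-case products are 72, 98, 128, 162, and 200, so the sum passed to EXP is 660.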
Defining group codes and intervals


In this example:


LET JOB_CODE = COD(JOB,35,62,57,28)
LET AGE_GRP = CUT(AGE,12,19,65)


the COD function uses values stored in JOB to create JOB_CODE: the value 35
becomes 1 in JOB_CODE, 62 becomes 2, 57 becomes 3, and 28 becomes 4. The CUT
function uses values stored in AGE to create four codes in AGE_GRP: cases with
AGE <= 12 are coded 1, cases with 12 < AGE <= 19 are coded 2, cases
with 19 < AGE <= 65 are coded 3, and cases with AGE > 65 are coded 4.


Function                        Result

COD(var,num1,num2,...)          values of var are replaced: num1 becomes 1, num2 becomes
                                2, etc.
COD(var$,'text1','text2',...)   values of var$ are replaced: text1 becomes 1, text2 becomes
                                2, etc.
INC(var,num1,num2,...)          1 (true) if the value in var matches one of the numbers num1,
                                num2, ...; 0 (false) otherwise
INC(var$,'text1','text2',...)   1 (true) if the value in var$ matches a value in text1, text2,
                                ...; 0 (false) otherwise
CUT(var,num1,num2,...)          defines intervals along a continuous variable. Values in var less
                                than or equal to num1 get value 1. Values greater than num1
                                and less than or equal to num2 get value 2, etc.
CUT(var$,'text1','text2',...)   groups variables alphabetically. Values in var$ less than text1
                                (alphabetically), including text1, get value 1. Values between
                                text1 and text2, including text2, get value 2, etc.
LAB$(var)                       yields a variable that contains the values generated by the
                                LABEL command
NCAT(var or var$,0 or 1)        yields number of categories of var or var$. The result includes
                                or does not include the missing value as a separate category
                                according to whether '1' or '0' is chosen for the second argument.
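
The interval rule that CUT applies to numeric values can be sketched in Python. This illustrates the rule as stated in the table (values up to and including a cutpoint fall in the lower interval) and is not SYSTAT code: the function name cut is hypothetical, the cutpoints are assumed to be given in ascending order, and missing values are not handled.

```python
from bisect import bisect_left

def cut(value, *cutpoints):
    """Return 1 for value <= first cutpoint, 2 for the next interval, and so on."""
    # bisect_left counts the cutpoints strictly below value, so a value equal
    # to a cutpoint stays in the lower interval.
    return bisect_left(cutpoints, value) + 1

for age in (12, 13, 19, 65, 66):
    print(age, cut(age, 12, 19, 65))
```

With the cutpoints 12, 19, and 65 from the example, the ages 12, 13, 19, 65, and 66 fall in groups 1, 2, 2, 3, and 4.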
Date and time functions


Function                    Action

NOW$('format')              the current time and/or date in the format you specify.
VAL(var$,'format',field)    extracts numbers from string dates or times in var$. format
                            is the format of the dates or times in var$. field is the number
                            of the format element you want to extract (for example, for
                            format MM/DD/YY, 1 corresponds to month, 2 to day, and 3
                            to year).
STR$(var,'format')          writes numeric dates or time values stored in var as characters
                            using format.
DOC(var$,'format')          returns day-of-century from a string variable containing a
                            date in the specified format.
DOC(yvar,mvar,dvar)         returns day-of-century from year, month, and day variables
                            (specify the arguments in this order).
DOW$(doc_var)               returns day-of-week (Monday, Tuesday, ...) from numeric
                            day-of-century doc_var.
DAT(n,format)               returns day or time from a numeric day-of-the-century n in the
                            specified format of either Y (year), M (month), D (day), h
                            (hour), m (minute), or s (second); specify only one.


+ In these date and time functions, SYSTAT uses the following built-in symbols: 


D  day       s  second    .  decimal point
M  month     m  minute    #  a number before or after a decimal
Y  year      h  hour


SYSTAT uses the smallest unit in format to interpret the value. Note that the number of M's
indicates how the month is reported:


MM 1 through 12 
MMM jan, feb, mar, apr, may, ..., dec. 
MMMMMMMM january, february, march, april, may, ..., december 
Character functions

Function                      Action

Changing case
UPR$(var$)                    changes the values of var$ to all capitals
LOW$(var$)                    changes the values of var$ to all lower case
CAP$(var$)                    capitalizes the first letter of each var$ value. Makes the
                              remaining letters of each value lower case.

Justifying character values
CNT$(var$,n)                  centers values of variable var$, where n is the width of the
                              character field
CNT$('text')                  centers text in a 72-character field
RGT$(var$,n)                  right-justifies values of variable var$, where n is the width
                              of the character field
RGT$('text')                  right-justifies text in a 72-character field
LFT$(var$,n)                  left-justifies values of variable var$, where n is the width of
                              the character field
LFT$('text')                  left-justifies text in a 72-character field

Deleting blanks or other characters
SQZ$(var$)                    removes all embedded blanks
SQZ$(var$,'text')             removes text from the values of var$

Locating a character in a string
IND(var$,'char')              the position of the first occurrence of char in each value
                              of var$

Finding number of characters in a string
LEN(var$)                     yields the number of characters in each case of var$

Extracting a case in a variable
DATA(var,n)                   value of var at the nth case
DATA$(var$,n)                 character value of var$ at the nth case

Converting numbers to characters and vice versa
VAL(var$)                     converts numbers stored in the string variable var$ to
                              numeric values that can be used in calculations
STR$(var)                     converts numeric values of var to string values

Extracting and inserting characters
MID$(var$,p,j)                extracts a string of j characters from each value of var$,
                              beginning with the pth character
SUB$(var$,'text1','text2')    replaces text1 with text2
PUT$(var$,'text',p,j)         beginning at the pth character of var$, replaces the next j
                              characters with the first j characters of text
RPD$(var$,'char')             right-pads the values of var$ with char
LPD$(var$,'char')             left-pads the values of var$ with char
Function                      Action

Concatenating character strings
CAT$(var1$,var2$)             joins contents of var1$ with var2$, eliminating trailing
                              blanks

ASCII codes
ICH('char')                   ASCII code corresponding to the character char
ICH(var$)                     ASCII code corresponding to the value in var$
ASC$(int)                     ASCII character corresponding to the integer int
ASC$(var)                     ASCII character corresponding to the value in var

Using Soundex to code names
SND$(var$)                    an alphanumeric SOUNDEX code of var$

Creating group labels
LAB$(var)                     yields a variable that contains the values generated by the
                              LABEL command
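
The position arguments of MID$ and PUT$ count characters starting from 1. The Python sketch below illustrates the extraction and replacement rules as they are described in the table; it is not SYSTAT code, the names mid and put are hypothetical, and the behavior when text has fewer than j characters is an assumption.

```python
def mid(s, p, j):
    """Extract j characters of s, beginning with the pth character (1-based)."""
    return s[p - 1:p - 1 + j]

def put(s, text, p, j):
    """Replace j characters of s, starting at position p, with the first j characters of text."""
    # Assumption: if text is shorter than j, only the available characters are inserted.
    return s[:p - 1] + text[:j] + s[p - 1 + j:]

print(mid("SYSTAT", 2, 3))
print(put("SYSTAT", "XYZ", 2, 2))
```

Here mid takes the three characters starting at position 2, and put overwrites positions 2 and 3 with the first two characters of the replacement text.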
Functions Relating to Probability Distributions


Distribution                   Cumulative            Density               Inverse               Random data
Uniform (0,1)                  UCF(x,low,hi)         UDF(x,low,hi)         UIF(α,low,hi)         URN(low,hi)
Normal (0,1)                   ZCF(z,loc,sc)         ZDF(z,loc,sc)         ZIF(α,loc,sc)         ZRN(loc,sc)
t                              TCF(t,df)             TDF(t,df)             TIF(α,df)             TRN(df)
F                              FCF(F,df1,df2)        FDF(F,df1,df2)        FIF(α,df1,df2)        FRN(df1,df2)
Chi-square                     XCF(χ²,df)            XDF(χ²,df)            XIF(α,df)             XRN(df)
Gamma                          GCF(x,shp,sc)         GDF(x,shp,sc)         GIF(α,shp,sc)         GRN(shp,sc)
Beta                           BCF(x,shp1,shp2)      BDF(x,shp1,shp2)      BIF(α,shp1,shp2)      BRN(shp1,shp2)
Exponential (0,1)              ECF(x,loc,sc)         EDF(x,loc,sc)         EIF(α,loc,sc)         ERN(loc,sc)
Logistic (0,1)                 LCF(x,loc,sc)         LDF(x,loc,sc)         LIF(α,loc,sc)         LRN(loc,sc)
Studentized                    SCF(x,k,df)           SDF(x,k,df)           SIF(α,k,df)           SRN(k,df)
Weibull                        WCF(x,sc,shp)         WDF(x,sc,shp)         WIF(α,sc,shp)         WRN(sc,shp)
Cauchy                         CCF(x,loc,sc)         CDF(x,loc,sc)         CIF(α,loc,sc)         CRN(loc,sc)
Double exponential (Laplace)   DECF(x,loc,sc)        DEDF(x,loc,sc)        DEIF(α,loc,sc)        DERN(loc,sc)
Gompertz                       GOCF(x,b,c)           GODF(x,b,c)           GOIF(α,b,c)           GORN(b,c)
Gumbel                         GUCF(x,loc,sc)        GUDF(x,loc,sc)        GUIF(α,loc,sc)        GURN(loc,sc)
Inverse Gaussian (Wald)        IGCF(x,loc,sc)        IGDF(x,loc,sc)        IGIF(α,loc,sc)        IGRN(loc,sc)
Logit normal                   ENCF(x,loc,sc)        ENDF(x,loc,sc)        ENIF(α,loc,sc)        ENRN(loc,sc)
Lognormal                      LNCF(x,loc,sc)        LNDF(x,loc,sc)        LNIF(α,loc,sc)        LNRN(loc,sc)
Pareto                         PACF(x,thr,shp)       PADF(x,thr,shp)       PAIF(α,thr,shp)       PARN(thr,shp)
Rayleigh                       RCF(x,sc)             RDF(x,sc)             RIF(α,sc)             RRN(sc)
Triangular                     TRCF(x,a,b,c)         TRDF(x,a,b,c)         TRIF(α,a,b,c)         TRRN(a,b,c)
Loglogistic                    LOCF(x,logsc,shp)     LODF(x,logsc,shp)     LOIF(α,logsc,shp)     LORN(logsc,shp)
Erlang                         ERCF(x,shp,sc)        ERDF(x,shp,sc)        ERIF(α,shp,sc)        ERRN(shp,sc)
Non-central Chi-square         NXCF(x,df,δ)          NXDF(x,df,δ)          NXIF(α,df,δ)          NXRN(df,δ)
Non-central F                  NFCF(x,df1,df2,δ)     NFDF(x,df1,df2,δ)     NFIF(α,df1,df2,δ)     NFRN(df1,df2,δ)
Non-central t                  NTCF(x,df,δ)          NTDF(x,df,δ)          NTIF(α,df,δ)          NTRN(df,δ)
Smallest extreme value         SECF(x,loc,sc)        SEDF(x,loc,sc)        SEIF(α,loc,sc)        SERN(loc,sc)
Studentized maximum modulus    SMCF(x,k,df)          SMDF(x,k,df)          SMIF(α,k,df)          SMRN(k,df)
Binomial                       NCF(x,n,p)            NDF(x,n,p)            NIF(α,n,p)            NRN(n,p)
Poisson                        PCF(x,l)              PDF(x,l)              PIF(α,l)              PRN(l)
Discrete uniform               DUCF(x,N)             DUDF(x,N)             DUIF(α,N)             DURN(N)
Geometric                      GECF(x,p)             GEDF(x,p)             GEIF(α,p)             GERN(p)
Hypergeometric                 HCF(x,N,m,n)          HDF(x,N,m,n)          HIF(α,N,m,n)          HRN(N,m,n)
Negative binomial              NBCF(x,k,p)           NBDF(x,k,p)           NBIF(α,k,p)           NBRN(k,p)
Benford's Law                  BLCF(x,B)             BLDF(x,B)             BLIF(α,B)             BLRN(B)
Logarithmic series             LSCF(x,theta)         LSDF(x,theta)         LSIF(α,theta)         LSRN(theta)
Zipf                           ZICF(x,shp)           ZIDF(x,shp)           ZIIF(α,shp)           ZIRN(shp)


where low is the smallest value and hi, the largest value; loc is the location parameter
and sc, the scale parameter; shp is the shape parameter and thr, the threshold parameter;
and finally, df is the degrees of freedom. If low, hi, loc, or sc is omitted, the default
values, which are displayed in the distribution column, are assumed.
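
The four columns of the table (cumulative, density, inverse, random data) relate to one another in the same way for every distribution. The Python sketch below shows that relationship for the normal case using the standard library; it is not SYSTAT code. The lower-case names zcf, zdf, zif, and zrn are hypothetical stand-ins that borrow the SYSTAT names, and the defaults loc=0 and sc=1 follow the "(0,1)" shown in the distribution column.

```python
import random
from statistics import NormalDist

def zcf(z, loc=0.0, sc=1.0):
    """Cumulative probability P(Z <= z)."""
    return NormalDist(loc, sc).cdf(z)

def zdf(z, loc=0.0, sc=1.0):
    """Density at z."""
    return NormalDist(loc, sc).pdf(z)

def zif(alpha, loc=0.0, sc=1.0):
    """Inverse of the cumulative function: the z whose cumulative probability is alpha."""
    return NormalDist(loc, sc).inv_cdf(alpha)

def zrn(loc=0.0, sc=1.0, rng=random):
    """One random draw."""
    return rng.gauss(loc, sc)

p = zcf(1.96)   # cumulative probability at 1.96 with the default parameters
print(p, zif(p), zdf(0.0), zrn())
```

Applying the inverse function to the output of the cumulative function returns the original value, which is a convenient check when the location and scale arguments are supplied explicitly.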


Section 


2 
Graphics 


This section contains descriptions of the uses of various types of graphical displays. 
It also lists the definitions of the commands for each of SYSTAT’s graphical 
procedures and their options. At the end of each procedure, the common options 
available for the display are listed. 

Global features are listed after the graphical procedures. These features are 
switches that affect the appearance of all subsequent graphs until you reset them. 

Finally, common options (local options) that apply to the current display only are 
listed. These options are available for most of SYSTAT’s graphs, so they are grouped 
together rather than repeated for each display. 
Graphical Displays


In this section, we describe SYSTAT’s graphical displays, indicating whether they are 
useful for displaying counts or describing distributions of continuous variables, or 
simply helpful for unraveling relations among variables. Later, when we define 
features and options for each display, we list the displays alphabetically. 


Univariate displays 


These displays allow you to examine each variable individually to get an idea of typical 
values, the spread of the values, and the shape of the distribution. In addition, you can 
screen for outliers and recording errors, lack of normality, and subpopulations. 


Bar, Dot, Line, Pie, Profile, and Pyramid charts 


SYSTAT offers six univariate graphical displays that are useful for characterizing the 
values of a single variable for the complete sample or for subpopulations defined by 
one or two grouping variables. 


Pie 


For each category of a variable or cross-classification of two grouping variables:

■ BAR displays a bar.

■ DOT displays a dot (or other plot symbol) where the top of the bar would be.

■ LINE displays a line connecting the points where the dots or tops of bars would be.

■ PROFILE fills in the area under the line.

■ PYRAMID draws pyramids instead of bars.

■ PIE (pie chart) displays the proportion of counts or measures (described below)
  falling within the category.


The height of each bar, dot, line, etc., represents: 


■ The count in that category. You specify one or two numeric or string variables with
  a few distinct categories and SYSTAT counts the number of cases in each category
  (or cross-classification).

■ The mean of the values in that category. You specify a quantitative variable and
  one or two stratifying variables with numeric or string codes. SYSTAT averages the
  values of the quantitative variable for each category of one stratifying variable (or
  the cross-classification of two variables).

■ The percentage of cases in that category. You select the PERCENT option. If you
  specify a categorical variable only, SYSTAT tallies the number of cases that have
  each unique value, sums the tallies, and computes the percentage each tally is of
  the total. If you specify a quantitative variable and a stratifying variable, SYSTAT
  computes the average value of the quantitative variable within each category, sums
  these means, and determines the percentage each mean is of the total.

■ A measurement or statistic input for the category. You input a record for each
  category with the statistic (for example, total sales for a region, maximum side
  effect score for a treatment) and the value of the stratifying variable.
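
The two percentage calculations described for the PERCENT option can be sketched in Python. This follows the wording of the list above and is not SYSTAT's implementation; the function names percent_of_counts and percent_of_means are hypothetical.

```python
from collections import Counter

def percent_of_counts(categories):
    """Tally each unique value, sum the tallies, and express each tally as a percentage."""
    tallies = Counter(categories)
    total = sum(tallies.values())
    return {k: 100.0 * v / total for k, v in tallies.items()}

def percent_of_means(values, categories):
    """Average the values within each category, sum the means, and express each mean as a percentage."""
    groups = {}
    for v, c in zip(values, categories):
        groups.setdefault(c, []).append(v)
    means = {c: sum(vs) / len(vs) for c, vs in groups.items()}
    total = sum(means.values())
    return {c: 100.0 * m / total for c, m in means.items()}

print(percent_of_counts(["a", "a", "b", "c"]))
print(percent_of_means([10, 30, 60], ["x", "x", "y"]))
```

Note that in the second case the percentages are shares of the sum of the category means, not shares of the grand total of the raw values.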


Alternative displays include multivariable bar charts, percentage bar charts, range bar 
charts, stacked bar charts, divided bar charts, anchored bar charts, star plots, and 
attention maps. 

When the grouping variable has two levels, a dual display is available. Real and 
pseudo 3-D graphs can be requested with all displays except PIE. (PIE does offer a 
pseudo 3-D display, however.) You can interactively rotate the 3-D displays. 

You can include several variables in one display with their respective bars laid out 
side by side (or stacked on top of one another) within each category, or, for repeated 
measures designs, the bars for all categories for the first variable positioned before 
those for subsequent variables. For either structure (bars grouped within categories or 
bars of categories grouped within variables), results can be stratified—that is, 
displayed in separate frames. The data for these displays can be in cases-by-variables 
form (multiple cases per category) or aggregated by category with the count, mean, or 
other measure and category identifier. 
Histograms, box plots, and density displays


The density of a sample is the relative concentration of data points in intervals across 
the range of the distribution. A histogram is one way to display the density of a 
quantitative variable; box-and-whisker displays, dit or dot plots, frequency polygons, 
fuzzygrams, jitter plots, stripe plots, and histograms with data-driven bar widths are 
others. 

A histogram is the most familiar one among these displays. The word comes from 
a Greek word (histos) for a straight standing beam, like a mast or loom frame, and a 
word (gram) for a drawn picture. Thus, a histogram is a pictorial display of vertically 
standing bars. It is a crude density estimator because the shape of a histogram depends 
upon the choice of the number of bars. Most other graphical density estimation 
methods depend on subjective choices of parameters (or settings) as well, which is one 
reason the general field of density estimation is rather controversial (Wegman, 1982). 
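
To see how strongly the shape of a histogram depends on the number of bars, the following Python sketch bins one small sample in two ways. It is an illustration only: the function histogram_counts is hypothetical, it assumes equal-width bars spanning the range of the data, and it does not reproduce how SYSTAT chooses bar boundaries.

```python
def histogram_counts(values, bars):
    """Count values in equal-width bars spanning min(values) to max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bars
    counts = [0] * bars
    for v in values:
        # The largest value is placed in the last bar rather than in a bar of its own.
        k = min(int((v - lo) / width), bars - 1)
        counts[k] += 1
    return counts

sample = [1, 2, 2, 3, 3, 3, 7, 8, 8, 9]
print(histogram_counts(sample, 2))
print(histogram_counts(sample, 4))
```

With two bars the sample looks roughly balanced; with four bars an empty interval appears between the two clusters.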

SYSTAT can use the sample mean and standard deviation to construct a normal 
density curve (or cumulative normal distribution curve) for comparison against the 
actual anomalies of the sample distribution. A nonparametric kernel density 
estimator is also available for density and distribution curves. 

Rather than comparing sample values to the normal distribution (mean, standard 
deviation, etc.), box plots show robust statistics (median, quartiles, etc.). Some 
complain that box plots or the choice of intervals for bars in a histogram can mask gaps 
or separations in the distribution. Dot and dit plots answer this problem because they 
display every value in the sample. It is often useful to examine both a box plot and a 
dit display. A gap display is another alternative. Its bar widths vary across the range of
the distribution: when there are gaps, the neighboring bar is made wider to include the
gap.
Fuzzygrams superimpose a probability distribution on each bar of a histogram.
Bars for histograms based on small samples are fuzzier than bars for large sample 
histograms. Jittered density plots place points along a horizontal data scale at the 
exact locations of data values. To keep points from colliding, they are jittered 
randomly on a short vertical axis. These displays work better for large samples than 
small samples. Density stripes are vertical lines placed at the location of data values 
along a horizontal data scale and look like supermarket bar codes. For large samples, 
the stripes tend to collide, so you should consider a jitter plot instead. 


Jitter Stripe 


Fuzzy 
These displays can be stratified across the levels of a grouping variable, or, if the
grouping variable has only two values, a dual (or back-to-back) version is available.

Quantile and probability plots


PPLOT QPLOT 


Quantile plots and probability plots are also useful for studying the distribution of a 
variable. QPLOT produces quantile plots (Q plots). Unlike probability plots, which 
compare a sample to a theoretical probability distribution, a quantile plot compares a 
sample to its own quantiles (a one-sample plot) or to another sample (a two-sample or 
Q-Q plot). The quantile of a sample is the data point corresponding to a given fraction 
of the data. 
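
The idea of pairing each data point with a fraction of the data can be sketched in Python. This is an illustration, not SYSTAT's algorithm: the function sample_quantiles is hypothetical, and the plotting position (i - 0.5)/n for the ith ordered value is an assumed convention, since the text above does not say which fraction is assigned to each point.

```python
def sample_quantiles(values):
    """Pair each ordered data value with the fraction of the data it corresponds to."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumed plotting position: (i - 0.5) / n for the ith ordered value, i = 1..n.
    return [((i + 0.5) / n, v) for i, v in enumerate(ordered)]

for fraction, value in sample_quantiles([3, 1, 2]):
    print(round(fraction, 3), value)
```

A one-sample quantile plot graphs these pairs; a two-sample plot graphs the ordered values of one sample against those of the other.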

PPLOT plots the values of a variable against the corresponding percentage points of
a theoretical normal, half-normal, chi-square, uniform, exponential, gamma, or
Weibull distribution. Graphs like this are called probability plots or P-plots. With
PPLOT, you can even construct detrended probability plots, like the lower plot on
the left. You can also plot the expected values of a variable against the expected values
of another variable (P-P plot).
Bivariate and multivariate displays


Scatter Plot

The scatterplots and other displays here are useful for discovering and describing
relationships among variables, revealing unusual values, and highlighting differences 
among groups. 

PLOT produces bivariate scatterplots, 3-D scatterplots, and other plots of continuous 
variables against each other (or a categorical variable). A wealth of options allows you 
to incorporate confidence ellipses for the sample or centroid, kernel density estimators, 
convex hulls, Voronoi tessellations, Delaunay triangulations, minimal spanning trees, 
the traveling salesman algorithm for the shortest path connecting points, or a vector 
option that connects each point to another specific point. You can also request a bubble 
plot, an influence plot, or a high-low-close plot. 

You can use the graph tools to dynamically rotate 3-D plots and fine-tune displays. 
You can also change the plot shape, range limits for the axes, values of a tension 
parameter for a smoother, and so on. 

You can add a regression line with an optional confidence band or one of 19 
smoothers (10 for 3-D displays). Another option generates residuals automatically. 
Use it in conjunction with the graph tools to select an appropriate data transformation. 

You can plot subpopulations in separate frames or overlay them in a single display. 
When groups are specified, features such as smoothers, ellipses, and hulls are 
displayed separately for each group. You can specify symbols to identify group 
membership or label each case (point) with a unique name (up to 12 characters). 

Options are available to plot in polar, cylindrical, spherical, and triangular 
coordinates. 
Multivariate displays


In this section, we introduce displays useful for describing values of three or more 
variables: Andrews’ Fourier plots, parallel coordinate plots, scatterplot matrices 
(SPLOMs), icons, and multiplots. 


Fourier Parallel SPLOM

FOURIER constructs a waveform made up of sine and cosine components, with a
different sine or cosine component for each variable you include. Cases with similar 
values have waveforms with similar shapes, making it easy to recognize distinct 
subgroups of cases. 

PARALLEL draws an axis for each variable and positions them side by side so they 
are parallel. The same scale is used for each axis. The values for a case are plotted on
the axes and connected with a line segment. The patterns of the connecting lines vary 
by subgroup. 

SPLOM positions a bivariate scatterplot for each specified pair of variables in a row 
and column display. A common vertical scale is used for all plots within a row, and a 
common horizontal scale is used within a column. Thus, the display is a matrix of 
scatterplots, each one corresponding to an entry in a correlation matrix for the 
variables. 

ICON represents the value of multiple variables as cartoon faces, Fourier blobs, 
stars, histograms, and other shapes. For some data, you will be able to identify groups
of icons (cases) that elude automated clustering methods because your eye can
perceive nonlinear, disjunctive relationships. 

The MULTIPLOT option creates two-dimensional tables of a particular graph type. 
Tabular categories appear at the top and left of the table, and scale values appear at the 
bottom and right. You can apply the multiplot layout to scatterplots, probability and 
quantile plots, bar charts, dot charts, line charts, profile charts, and pyramid charts. 


Graph tools 


After a plot appears on the screen, you can fine-tune and customize your display, print 
or save it, and rotate 3-D displays. Post-creation plot changes include colors, symbols, 
axes, titles, size, and location. Use the Graph Properties dialog box to request a power 
transformation of each variable and to specify tension for a smoother or set the number 
of bars in a histogram. For more detailed information, see the Editing Graphs chapter in
the SYSTAT Graphics manual.
BAR: Bar charts


For each category of a variable (or cross-classification of two stratifying variables), a 
bar chart displays a bar. Using the cases in each category, the height of its bar can be 
the count in that category, the mean or median of a quantitative variable for the cases, 
the percentage of cases in that category, or a statistic or measurement input for the 
category. 

Alternative bar chart displays include multivariable bar charts organized by groups 
or repeated measures, percentage bar charts, range bar charts, stacked bar charts, 
divided bar charts, and anchored bar charts. Bar charts can be produced in 2-D or 3-D 
and can be dynamically rotated on the screen. 


BAR  x-varlist
     . * y-var * x-var
     y-varlist * x-var
     z-varlist * y-var * x-var

The following options are available:


/ MEDIAN      displays the median for each variable in y-varlist (for a 2-D plot) or z-varlist
              (for a 3-D plot).
  PERCENT     displays each count, mean, or median as the percentage of that group's
              contribution to the total. You may want to use STACK with this option.
  STACK       stacks bars of a multivariable bar chart instead of displaying them side by side.
              Can be used with both 2-D and 3-D plots.
  RANGE       plots the range of two variables in y-varlist against the x-var.
  BASE=n      anchors bars at n.
  POLAR       produces a bar chart in polar coordinates.
  TILE        produces a colored (or filled if SURFACE=FILL) 2-D mosaic tiling of 3-D
              bars.
  BTHICK=n    controls the amount of space between tick marks filled by the bars. Specify
              0 <= n <= 1. The default value is 0.5.


Examples: 


BAR SEX$ EDUCATN               Two bar charts of counts arranged side by side. One bar chart
                               displays the total number of males and females, and
                               the second chart displays the total number of people in
                               each education level.
BAR . * EDUCATN * SEX$         A 3-D display where the height of each bar represents
                               the total number of males and females in each level
                               of education.
BAR INCOME * EDUCATN           The bars represent average income for each level of
                               education.
BAR INCOME * EDUCATN * SEX$    A 3-D display representing average income for males
                               and females in each level of education.


Common Options: 


ACOLOR       ALTITUDE       AXES         COL          COLOR        CSIZE
DIRECTION    DUAL           ERROR        ETHICK       ETYPE        FCOLOR
FILL         FTITLE         GROUP        HEIGHT       LABEL        LEGEND
LLABEL       LOC            LTITLE       MATRIX       MULTIPLOT    OVERLAY
PROJECT      REPEAT         ROW          SCALE        SERROR       SLOPE
SPHERE       STICK          THREED       TICK         TITLE        TRANSPOSE
WIDTH        X/Y/ZFORMAT    X/Y/ZGRID    X/Y/ZLABEL   X/Y/ZLIMIT   X/Y/ZLOG
X/Y/ZMAX     X/Y/ZMIN       X/Y/ZPIP     X/Y/ZPOW     X/Y/ZREV     X/Y/ZTICK
DENSITY: Density charts


A histogram is one way to display the density of a quantitative variable; box-and- 
whisker displays, dit or dot plots, frequency polygons, fuzzygrams, jitter plots, stripe 
plots, and histograms with data-driven bar widths are others. The options and features 
to produce these displays are described below. 


DENSITY  x-varlist
         . * y-var * x-var
         y-varlist * x-var


The following options are available: 


/ HIST histogram (DENSITY command with HIST options can be replaced by HIST 
command) 

BOX box-and-whisker display. Use NOTCH to insert notches that mark confidence 
intervals (DENSITY command with BOX options can be replaced by BOX 
command) 

DOX box plot combined with a dot plot 

POLY frequency polygon 

CUM use with HIST, GAP, POLY, FUZZY, NORMAL, or KERNEL to display
cumulative densities.


NORMAL normal curve computed using the sample mean and standard deviation. (Bars 
are omitted.) 


KERNEL nonparametric kernel density estimator. TENSION=n controls the stiffness of
the KERNEL smooth. A higher value of n uses more data points to smooth
each value and makes the smooth stiffer. A lower value of n makes the
smooth looser and more susceptible to the influence of individual points.
Specify 0 < n <= 1. The default is 0.5.


DIT dit plot 
DOT symmetric dot plot
GAP histogram with data-driven bar width 
FUZZY fuzzygram 
STRIPE density stripe 
JITTER jittered density graph 
BARS=n number of bars to use in a histogram 
BWIDTH=n width of bars in a histogram 
TILE produces a colored (or filled if SURFACE=FILL) 2-D mosaic tiling of a 3-D 
graph 
CONTOUR produces 2-D contour lines of 3-D KERNEL or NORMAL bivariate density 
estimators. Use ZTICK to control the number of solid contour lines (at major 
tick marks) and ZPIP to control the number of dashed contour lines (at minor 
tick marks). Use CUT (default CUT=30) to control the resolution of contours.
POLAR produces a density display in polar coordinates 
Examples:

DENSITY INCOME                     a (HISTOGRAM) density display of income.
DENSITY . * DEATH_RT * BIRTH_RT    density of death rate by birth rate is displayed in 3-D.
DENSITY INCOME * SEX$ / BOX        displays two box plots of income, one for females
                                   and one for males.


Common Options:


ALTITUDE AXES COL COLOR CUT
DUAL FCOLOR FILL FTITLE GROUP
LEGEND LLABEL LOC LTITLE OVERLAY
REPEAT SCALE SIZE SLOPE SPHERE
STICK SURFACE SYMBOL TICK TITLE TRANSPOSE
WIDTH X/Y/ZFORMAT X/Y/ZGRID X/Y/ZLABEL X/Y/ZLIMIT X/YLOG
X/Y/ZMAX X/Y/ZMIN X/Y/ZPIP X/YPOW X/Y/ZREV X/Y/ZTICK
DOT: Dot plots

DOT plots display plot symbols to show counts in categories or to show means or
medians of continuous variables within categories. Imagine putting dots at the tops of 
the bars in a bar chart and erasing the bars. 


DOT x-varlist
    . * y-var * x-var
    y-varlist * x-var
    z-varlist * y-var * x-var


The following options are available: 


/ MEDIAN displays the median for each variable in y-varlist (for a 2-D plot) or z-varlist
(for a 3-D plot).


PERCENT displays each count, mean, or median as the percentage of that group's 
contribution to the total. 


POLAR produces a dot plot in polar coordinates 


LINE connects dot plot symbols with a line, in left-to-right order 
Examples:

DOT SEX$ EDUCATN              Two dot charts of counts arranged side by side. One dot
                              chart displays the total number of males and females,
                              and the second chart displays the total number of people
                              in each education level.
DOT . * EDUCATN * SEX$        A 3-D display where the height of each dot represents
                              the total number of males and females in each level of
                              education.
DOT INCOME * EDUCATN          The dots represent average income for each level of
                              education.
DOT INCOME * EDUCATN * SEX$   A 3-D display representing average income for males
                              and females in each level of education.
Common Options:


ACOLOR ALTITUDE AXES COL COLOR DASH
DIRECTION DUAL ERROR ETHICK ETYPE FCOLOR
FILL FTITLE GROUP HEIGHT LEGEND LLABEL
LOC LTITLE MATRIX MULTIPLOT OVERLAY PROJECT
REPEAT ROW SCALE SERROR SIZE SLOPE
SPHERE STICK SYMBOL THREED TICK TITLE
TRANSPOSE WIDTH X/Y/ZFORMAT X/Y/ZGRID X/Y/ZLABEL X/Y/ZLIMIT
X/Y/ZLOG X/Y/ZMAX X/Y/ZMIN X/Y/ZPIP X/Y/ZPOW X/Y/ZREV
X/Y/ZTICK
DRAW: Drawing objects


The DRAW command draws objects. 


DRAW argument 


Use argument to specify an object to draw on a graphical display. The arguments are 
shown grouped with their respective options. 


The following arguments and options are available: 


Argument   Options          Description

ARROW                       draws an arrow
           / FROM=n1,n2     FROM and TO specify the starting (n1, n2) and ending
             TO=n3,n4       (n3, n4) coordinates of the arrow relative to the origin.
                            The unit can be inches (IN) or centimeters (CM); if you
                            don't specify a unit, IN is used. The default value for
                            each coordinate is 0.

LINE                        draws a line segment
           / FROM=n1,n2     FROM and TO specify the starting (n1, n2) and ending
             TO=n3,n4       (n3, n4) coordinates of the line relative to the origin.
                            The unit can be inches (IN) or centimeters (CM); if you
                            don't specify a unit, IN is used. The default value for
                            each coordinate is 0.

BOX                         draws a box
           / LOC=n1,n2      LOC specifies the coordinates (n1, n2) for the lower
             CENTER         left (bottom) corner of the box. CENTER centers the box
                            on (n1, n2). The unit can be inches (IN) or centimeters
                            (CM); if you don't specify a unit, IN is used.

CIRCLE                      draws an ellipse
           / LOC=n1,n2      LOC specifies the coordinates (n1, n2) for the lower
             CENTER         left (bottom) corner of a hypothetical box enclosing the
                            ellipse. To center the ellipse at (n1, n2), use the CENTER
                            option. The unit can be inches (IN) or centimeters
                            (CM); if you don't specify a unit, IN is used. Specify
                            the size of the ellipse using the HEIGHT and WIDTH
                            options. To draw a circle, use identical values for
                            HEIGHT and WIDTH.

SYMBOL                      draws the specified symbol
           / SYMBOL=n       Use the SYMBOL option to specify which of SYSTAT's
             LOC=n1,n2      23 symbols to draw. The symbol is centered on the
                            point of the coordinates (n1, n2) specified with LOC.
                            The unit can be inches (IN) or centimeters (CM); if you
                            don't specify a unit, IN is used. Specify the symbol's
                            height and width with the SIZE option.

TRIANGLE                    draws a triangle
           / LOC=n1,n2      LOC specifies the coordinates (n1, n2) for the lower
             CENTER         left (bottom) corner of the triangle. The unit can be
                            inches (IN) or centimeters (CM); if you don't specify a
                            unit, IN is used. CENTER centers the triangle on (n1, n2).


Examples:


DRAW ARROW / FROM=0IN,2IN TO=3IN,4IN
DRAW LINE / FROM=0IN,2IN TO=3IN,4IN DASH=11
DRAW BOX / HEI=2IN WID=2IN FILL=5
DRAW CIRCLE / LOC=2IN,2IN HEI=3IN WID=3IN
DRAW SYMBOL / SYMBOL=10 FILL=5 LOC=2IN,2IN,
              HEI=3IN WID=3IN
DRAW TRIANGLE / HEI=2IN WID=2IN


Common Options: 


COLOR DASH HEIGHT LOC
SIZE SYMBOL WIDTH
FOURIER: Andrews' Fourier plots


[Figure: Andrews' Fourier plot; horizontal axis in degrees]


Plot Andrews’ Fourier components (one waveform for each case). Use to display 
information for several variables in two dimensions. 


FOURIER waveform-varlist 

The following option is available: 

/ POLAR produces a Fourier plot in polar coordinates 
Example:


FOURIER CARBO FAT PROTEIN   produces an Andrews' Fourier plot with one waveform per
                            case, which is a trigonometric combination of the selected
                            variables: CARBO, FAT, and PROTEIN.
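
The "trigonometric combination" in the example can be illustrated with the function usually given for Andrews' plots: f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + ... The Python sketch below evaluates that textbook form for one case. It is only an illustration: the data values are hypothetical, and the exact scaling and variable ordering SYSTAT applies before plotting are not stated here and are assumed away.

```python
import math

def andrews_curve(case, t):
    """Evaluate the textbook Andrews function for one case at angle t (radians)."""
    total = case[0] / math.sqrt(2)
    for i, x in enumerate(case[1:], start=1):
        k = (i + 1) // 2  # harmonic number: 1, 1, 2, 2, 3, ...
        # odd positions use sine, even positions use cosine
        total += x * (math.sin(k * t) if i % 2 == 1 else math.cos(k * t))
    return total

# Hypothetical values for CARBO, FAT, and PROTEIN in a single case
case = [30.0, 10.0, 20.0]
angles = [math.radians(d) for d in range(-180, 181, 90)]
waveform = [andrews_curve(case, t) for t in angles]
```

Each case yields one such waveform, so cases with similar variable values trace similar curves.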


Common Options: 
ACOLOR AXES COL COLOR CSIZE DASH
FCOLOR FILL FTITLE GROUP HEIGHT LABEL
LEGEND LLABEL LOC LTITLE OVERLAY ROW
SCALE SLOPE STICK TITLE TRANSPOSE WIDTH
X/YFORMAT X/YGRID X/YLABEL X/YLIMIT X/YPIP X/YREV
X/YTICK


FPLOT: Function Plots 




FPLOT produces function plots in two and three dimensions. You type the equation for 
the function—no data values are used. For example, for a 2-D plot, type 


Y = 3 + X/2


and for a 3-D plot, type

Z = 2*X + 3*Y


When typing an equation for FPLOT, make sure that none of the variables in the
equation are names of variables in the current data file.


Note: Use a semicolon (;) instead of a slash (/) to separate the options from the 
equation. 


FPLOT equation 


Plots the mathematical function defined by the specified equation. Be sure to use a
semicolon (;) to separate the option list from the equation. Variable names used in the
equation should not match variable names in the current data file.


; SPHERE plots a spherical function 

POLAR produces a polar coordinate plot. The y scale is the distance of a point from
the origin and the x scale is the angle between the horizontal axis and a line
from the origin to the point. The z scale is the same as in rectangular
coordinates. Use with AXES and SCALE=NONE, 1, or 2 to choose the axes (or
scales) to be drawn. NONE draws no axes, 1 draws the circular theta axis,
and 2 draws both the r and theta axes.

TRI plots a triangular coordinate contour plot (uses four variables) 


CONTOUR produces a 2-D contour plot with lines. Use with ZTICK to specify the
number of contour lines (uses three variables). The default is a 30 x 30 grid.


TILE produces a filled 2-D contour plot with a 30 x 30 grid 
Example:


FPLOT z=x*y*(x^2-y^2)/(x^2+y^2) ; XMIN=-10 XMAX=10 YMIN=-10 YMAX=10,
ZMIN=-100 ZMAX=100
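
To see what a function plot of this kind evaluates, the following Python sketch computes a surface on a 30 x 30 grid over the x and y limits used in the example. It assumes the example equation reads z = x*y*(x^2 - y^2)/(x^2 + y^2), and it borrows the 30 x 30 grid size from the CONTOUR and TILE descriptions; how FPLOT samples the function internally is not documented here, so treat this as an illustration only.

```python
def f(x, y):
    """Example surface; defined as 0 at the origin to avoid dividing by zero."""
    r2 = x * x + y * y
    return 0.0 if r2 == 0 else x * y * (x * x - y * y) / r2

def grid(lo, hi, n):
    """Return n evenly spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

xs = grid(-10.0, 10.0, 30)   # XMIN=-10 XMAX=10
ys = grid(-10.0, 10.0, 30)   # YMIN=-10 YMAX=10
surface = [[f(x, y) for x in xs] for y in ys]

zmin = min(min(row) for row in surface)
zmax = max(max(row) for row in surface)
```

On this domain the computed values stay well inside the ZMIN and ZMAX limits given in the example, so the whole surface is visible.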


Common Options: 


ACOLOR ALTITUDE AXES COLOR CUT
FCOLOR HEIGHT LEGEND LOC
SLOPE STICK SURFACE TICK
TRANSPOSE WIDTH X/Y/Z/WFORMAT X/Y/ZGRID
X/Y/ZLIMIT X/Y/Z/WLOG X/Y/Z/WMAX X/Y/Z/WMIN
X/Y/Z/WPOW X/Y/Z/WREV X/Y/Z/WTICK
ICON: Icon plots


Icons are pictures for displaying multivariate data. Given a data set containing
measurements of n cases on p variables, you plot n icons (one for each case) with p 
different features in each icon. 

SYSTAT offers a variety of icons for representing multivariate data: cartoon faces, 
Fourier blobs, stars, histograms, rectangles, and others. You should try several of these 
on the same data to see how they work. You should also compare them to automated 
techniques such as discriminant analysis and clustering. For some data, you will be 
able to locate clusters that elude automated methods because your eye can perceive 
nonlinear, disjunctive relationships. Icons cannot replace formal statistical models, but 
they are indispensable exploratory tools. 

When using icons, you should make sure that your variables are on similar scales. 
Otherwise, one bar in all the icons would be tall and the rest barely visible. If the 
variables are on different scales, you can transform them to z scores using the SD 
option for STAND or to a 0,1 scale using the RANGE option. 
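
The two rescalings mentioned above can be sketched in Python as follows. This is a hedged illustration of the arithmetic only: the helper names are made up, and the z score uses the sample standard deviation (n - 1 in the denominator), which is an assumption about what the SD option computes.

```python
from statistics import mean, stdev

def z_scores(values):
    """Center on the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def unit_range(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical variable measured on a large scale
spending = [200.0, 400.0, 600.0]
standardized = z_scores(spending)
rescaled = unit_range(spending)
```

After either transformation, variables measured in very different units contribute comparably sized features to each icon.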


ICON feature-varlist 


The following options are available: 


/ ARROW            arrow icons
  BLOB             Fourier blob icons
  FACE             Chernoff's faces
  HIST             histogram icons
  PROFILE          profile icons
  THERM            framed thermometer-shaped icons
  STAR             star icons
  SUN              factorial sun plots based on first two principal components
  VANE             weathervane icons
  ILOC=var1,var2   centers the icons on the x and y coordinates specified by var1 and
                   var2. This is useful when placing icons on a map.
  PROJECT =        longitude and latitude mapping projections:
    GNOMON         oblique gnomonic
    STEREO         oblique stereographic
    MERCATOR       Mercator conformal
    ORTHO          oblique orthographic
    LAMBERT        Lambert equal area cylindrical
    ROBINSON       Robinson
    SINUSOIDAL     sinusoidal
    MILLER         Miller cylindrical
    PETERS         Peters
    FISHEYE        fisheye
  POLAR            produces an icon plot in polar coordinates


Example:


ICON EDUCATN HEALTH MILITARY   produces an icon plot with one icon for each case. Each
                               icon has three different features which represent the three
                               variables: EDUCATN, HEALTH, and MILITARY.
Common Options:


COLOR CSIZE DASH FCOLOR FILL
GROUP HEIGHT LABEL LEGEND LLABEL
LTITLE OVERLAY PROJECT ROW SIZE
TRANSPOSE WIDTH X/YMAX X/YMIN X/YREV
XPOW
LINE: Line charts


LINE charts show counts within categories or show means or medians of continuous 
data against categorical data. 


LINE  x-varlist 
.* y-var * x-var 
y-varlist * x-var 
z-varlist * y-var * x-var 


The following options are available: 


/ MEDIAN displays the median for each variable in y-varlist (for a 2-D plot) or z-varlist 
(for a 3-D plot). 


PERCENT displays each count, mean, or median as the percentage of that group's
contribution to the total.
POLAR produces a line chart in polar coordinates. 

Examples: 

LINE SEX$ EDUCATN Two line charts of counts arranged side by side. One line 
chart displays the total number of males and females, 
and the second chart displays the total number of people 
in each education level. 

LINE . * EDUCATN * SEX$ A 3-D display where the height of each line represents 
the total number of males and females in each level of 
education. 

LINE INCOME * EDUCATN The lines represent average income for each level of 
education. 


LINE INCOME * EDUCATN * SEX$ A 3-D display representing average income for males 
and females in each level of education. 
Common Options:


ACOLOR ALTITUDE AXES COL DASH
DIRECTION ERROR ETHICK FCOLOR
FILL GROUP HEIGHT LLABEL
MATRIX MULTIPLOT PROJECT
SCALE SERROR SPHERE
STICK TICK TITLE WIDTH
X/Y/ZFORMAT X/Y/ZGRID X/Y/ZLABEL X/Y/ZLIMIT X/Y/ZMAX
X/Y/ZMIN X/Y/ZPIP X/Y/ZPOW X/Y/ZREV X/Y/ZTICK
MAP: Maps


MAP produces maps in oblique gnomonic, oblique stereographic, Mercator conformal, 
oblique orthographic, Lambert, Robinson, sinusoidal, Miller, and Peters projections. 
Users can use SYSTAT’s boundary map of the continental United States, obtain 
additional map files from SYSTAT, or create their own map file. Options are available 
for filling map polygons with colors, shading, or patterns to indicate the values of a 
variable (for example, average income within each country). It is also possible to 
include the value of a variable within a polygon by using contours or icons. 


MAP 


Draws boundary maps. To draw maps, you must have two files: a regular .SYZ
data file with one case per region, and a special map boundary file with the
extension .SMP.


The following options are available: 


/ PROJECT =      longitude and latitude mapping projections:
    GNOMON       oblique gnomonic
    STEREO       oblique stereographic
    MERCATOR     Mercator conformal (default)
    ORTHO        oblique orthographic
    LAMBERT      Lambert equal-area cylindrical
    ROBINSON     Robinson
    SINUSOIDAL   sinusoidal
    MILLER       Miller cylindrical
    PETERS       Peters
    FISHEYE      fisheye
  SPHERE         plots the map on a globe
Example:


MAP / HEI=3IN WID=4.5IN LABEL=STATES


Common Options:


ACOLOR AXES COLOR CSIZE DASH FCOLOR
FILL HEIGHT LABEL LEGEND LLABEL LOC
LTITLE SCALE STICK TICK WIDTH X/YFORMAT
X/YGRID X/YLABEL X/YLIMIT X/YLOG X/YMAX X/YMIN
X/YPIP X/YPOW X/YREV X/YTICK
PARALLEL: Parallel coordinate displays


[Figure: parallel coordinate display with axes SEPALLEN, SEPALWID, PETALLEN, PETALWID]


Produces a parallel coordinate plot that displays information for several variables in 
two dimensions. Specify two or more parallel variables. 


PARALLEL parallel-varlist 


The following option is available: 


/ POLAR produces a parallel plot in polar coordinates


Example:


PARALLEL CARBO FAT PROTEIN   produces a parallel coordinate display with a profile line
                             representing each case. Each profile line is defined by
                             the values of the CARBO, FAT, and PROTEIN variables.


Common Options:


ACOLOR CSIZE DASH FCOLOR LABEL
LEGEND LLABEL SCALE SLOPE WIDTH
X/YGRID X/YLABEL YFORMAT YLOG YMAX
PIE: Pie charts


[Figure: pie chart with slices labeled by education level, for example HS grad and Some college]


PIE produces pie charts showing proportions of counts (if you specify a categorical 
variable only) or of means (if you specify a measure and a categorical variable). For 
counts, SYSTAT tallies the number of cases in each category, converts each count to a 
proportion of the total count, and shows that proportion with a wedge. For measures, 
SYSTAT computes a mean for each category, converts each mean to a proportion of 
the sum of the means, and shows that proportion as a wedge of the whole. 
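
The two wedge calculations described above amount to simple arithmetic. The Python sketch below follows that description with hypothetical data and made-up helper names; it is not SYSTAT code.

```python
from collections import defaultdict

def count_proportions(categories):
    """Wedge sizes for counts: each category's share of the total count."""
    counts = defaultdict(int)
    for c in categories:
        counts[c] += 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def mean_proportions(measures, categories):
    """Wedge sizes for measures: each category mean over the sum of the means."""
    sums, counts = defaultdict(float), defaultdict(int)
    for value, c in zip(measures, categories):
        sums[c] += value
        counts[c] += 1
    means = {c: sums[c] / counts[c] for c in sums}
    total = sum(means.values())
    return {c: m / total for c, m in means.items()}

# Hypothetical cases: education level and income
educatn = ["HS grad", "HS grad", "Some college", "HS grad"]
income = [10.0, 30.0, 60.0, 20.0]
by_count = count_proportions(educatn)
by_mean = mean_proportions(income, educatn)
```

Note that for measures the wedges compare category means with one another, so a category with few cases can still receive a large wedge.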


PIE category-varlist 
slice-varlist * category-var 


The following options are available: 
/ SLICE   separates a specific slice from the whole pie.
  RING    draws ring (attention) plots

Examples:

PIE EDUCATN            displays a pie chart where the slices represent the proportion of
                       the total number of people in each education level.
PIE INCOME * EDUCATN   produces a pie chart of average income for each level of education
                       as a proportion of the total of the average incomes.


Common Options:


ALTITUDE COL CSIZE FTITLE GROUP HEIGHT
LABEL LOC ROW SCALE THREED TITLE
WIDTH YLOG YPOW
PLOT: Bivariate and 3-D scatterplots


PLOT produces bivariate and 3-D scatterplots, including plots of variables against their 
case index and continuous variables against a categorical variable. 

Bivariate scatterplots show one or more variables on a vertical (y) axis against one
variable on the horizontal (x) axis. 3-D scatterplots show one or more variables on a 
vertical (z) axis over an x-y grid, where one variable is defined on the x axis and the 
other on the y axis. You can also plot the values of one or more variables against the 
case index. 

When two or more y variables are specified for a bivariate plot (or two or more z
variables for a 3-D plot), SYSTAT draws one plot for each. Use OVERLAY to combine
them in a single frame.


PLOT yvarlist
     yvarlist * xvar
     zvarlist * yvar * xvar
     wvarlist * zvar * yvar * xvar
The following options are available:


/ TRI            plot in triangular coordinates
  JITTER         moves data points by small uniform random amounts to expose points that
                 overlap
  ELL=n          draws a Gaussian bivariate ellipsoid for the sample using the p value
                 specified by n (0 < n < 1). The default value is 0.6827.
  ELM=n          draws a Gaussian confidence interval of the bivariate centroid using the p
                 value specified by n (0 < n < 1). The default value is 0.95.
  KERNEL=n       draws a confidence kernel of the p value specified by n
                 (0 < n < 1). The default value is 0.6827.
  HULL           surrounds all points in cloud with convex hull
  LINE           connects plot points as cases are ordered in the file with a line
  VECTOR=x,y,z   draws a vector plot. Lines are drawn from the point (x, y) or (x, y, z) to each
                 data point. If you do not specify a point, the lines are drawn from the graph's
                 origin.
  SPIKE=n        draws a vertical line from each point to the level on the vertical axis
                 corresponding to n
  HEX=n          splits the xy-plane into n x n grids for hexagonal binning (grids are not
                 displayed). The default value is 25. Minimum value is 2 and maximum value is 50.
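
The JITTER option above is described as adding small uniform random amounts to the data. A minimal Python sketch of that idea follows; the `amount` parameter and the seeded generator are assumptions made for the illustration, since the size of the displacement SYSTAT uses is not stated here.

```python
import random

def jitter(values, amount, seed=None):
    """Shift each value by a uniform random offset in [-amount, amount]."""
    rng = random.Random(seed)
    return [v + rng.uniform(-amount, amount) for v in values]

# Hypothetical data with ties that would overplot without jitter
educatn = [12, 12, 12, 16, 16]
spread = jitter(educatn, amount=0.2, seed=1)
```

Because the offsets are small relative to the spacing of the data, tied points separate visually without changing the overall pattern.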
Smoothers and residuals from a smoother


SMOOTH =     draws a smoother through the plot points
  LINEAR     fits E[y] = a + bx. The default.
  LOWESS     locally weighted scatterplot smoothing (not for 3-D plots). Can use with
             TENSION.
  DWLS       distance-weighted least-squares smoothing. Can use with TENSION.
  QUAD       fits E[y] = a + bx + cx^2
  LOG        fits E[y] = a + b ln(x) (not for 3-D plots)
  POWER      fits E[y] = ax^b (not for 3-D plots)
  SPLINE     fits several sections with y = a + bx + cx^2 + dx^3 (not for 3-D plots). Use for
             interpolating. Can use with TENSION.
  STEP       starts at the point with the smallest x value and drops (or rises) to the point
             with the next larger value. Use for interpolating.
  NEXPO      negative exponentially weighted smoothing. Similar to DWLS. Use for
             interpolating.
  INVERSE    inverse distance smoothing or Shepard's method of interpolation
  MEAN       running mean (moving average) smoother (not for 3-D plots). Can use with
             TENSION.
  MEDIAN     running median smoother (not for 3-D plots). Can use with TENSION.
  MODE       modal smoother (not for 3-D plots). Can use with TENSION.
  MIDRANGE   interquartile range; that is, draws a line through the points (x1, y1) and (x3,
             y3) where the 1 and 3 indicate the first and third quartiles (not for
             3-D plots). Can use with TENSION.
  KRIGING    Kriging smoother
  ANGLE      angle parameter
  ORDER      order parameter
  RATIO      ratio parameter
Robust smoothers

ANDREWS     These three smoothers use a psi function to downweight the influence of
BISQUARE    cases with extreme residuals on the estimates of a and b in y = a + bx.
HUBER
TRIMMED     running 50% trimmed mean (not for 3-D plots). Can use with TENSION.

SHORT       limits the domain of a smooth to the extreme data values on the horizontal
            axis.

CONFI       add the option CONFI=n for bands of confidence level n around the
            regression line. The default for CONFI is 0.95.

TENSION=n   controls stiffness for DWLS, INVERSE, KRIGING, LOWESS, MEAN,
            MEDIAN, MODE, TRIMMED, and SPLINE smoothing. A higher value
            of n uses more data points to smooth each value on the curve and makes
            the smooth stiffer. A lower n makes the smooth looser and more
            susceptible to the influence of individual data points. Use 0 <= n <= 1; the
            default value is 0.5.

RESID=type  plots residuals from a smoother (see one of the types listed above for
            SMOOTH)


Alternate 2-D displays and features


BORDER =    produces a scatterplot bordered by one of these univariate density
            displays or a cumulative density display (as indicated, by adding CUM):
  HIST      histogram (also HIST CUM)
  BOX       box-and-whisker display. The NOTCH option is also available.
  DIT       dit plot
  DOT       symmetric dot plot
  DOX       box plot combined with a dot plot
  GAP       histogram with data-driven bar widths (also GAP CUM)
  POLY      frequency polygon (also POLY CUM)
  NORMAL    normal curve computed using the sample mean and standard deviation
            (also NORMAL CUM)
  KERNEL    univariate nonparametric kernel density estimator. The TENSION
            parameter is available (also KERNEL CUM).
  FUZZY     fuzzygram (also FUZZY CUM)
  STRIPE    density stripe
  JITTER    jittered density graph
HILO        produces a high-low-close plot. Specify three continuous variables for
            the y axis against a continuous variable or character variable for the x
            axis. The first variable gives the close value; the second, the high value;
            and the third, the low value.
INFLUENCE      produces a plot where the size of each point represents the influence of
               that point on the correlation between the two variables. Hollow symbols
               represent positive influence, and filled symbols represent negative
               influence.
TSP            plots the Traveling Salesman algorithm solution for the shortest path
               connecting the points given by the x and y variables you specify
VORONOI        plots a Voronoi tessellation
DELAUNAY       plots the smallest number of triangles that connect all points
POLAR          plots data in polar coordinates. The y scale is the distance of a point from
               the origin, and the x scale is the angle between the horizontal axis and a
               line from the origin to the point.
PROJECT =      longitude and latitude mapping projections:
  GNOMON       oblique gnomonic
  STEREO       oblique stereographic
  MERCATOR     Mercator conformal
  ORTHO        oblique orthographic
  LAMBERT      Lambert equal area cylindrical
  ROBINSON     Robinson
  SINUSOIDAL   sinusoidal
  MILLER       Miller cylindrical
  PETERS       Peters
  FISHEYE      fisheye
SPAN           plots a minimum spanning tree
FLOWER         produces a sunflower plot where the density of symbols is determined by
               the number of observations falling on each point
Alternative 3-D displays and features


TILE produces a colored (or filled if SURFACE=FILL) 2-D mosaic tiling of a 3-D
smoothed surface. Default is 30 x 30 tiles (CUT=30).

CONTOUR produces 2-D contour lines of a 3-D smoothed surface. Use ZTICK to control the
number of solid contour lines (at major tick marks) and ZPIP to control the
number of dashed contour lines (at minor tick marks). Use CUT (default CUT=30) to
control the resolution of contours.

SPHERE plots data in spherical coordinates, using z = rho, y = phi, and x = theta

Examples:

PLOT INCOME                                  produces a plot of the values of income on the
                                             y axis for each case identified on the x axis.

PLOT INCOME * EDUCATN                        the values of income are plotted on the y axis for
                                             each level of education on the x axis.

PLOT INCOME * EDUCATN * AGE                  displays a 3-D plot of the values of income on the z
                                             axis for each level of education on the y axis, by each
                                             value of age on the x axis.

PLOT DEFECTS * PRESSURE * TEMP * DURATION    produces a triangular plot of the number of defects
                                             for each value of pressure by temperature by
                                             duration.


Common Options:


ACOLOR ALTITUDE AXES COL COLOR CSIZE
CUT DASH DUAL FCOLOR FILL FTITLE
HEIGHT LABEL LEGEND LLABEL LOC
MATRIX MULTIPLOT OVERLAY REPEAT ROW
SLOPE STICK SURFACE SYMBOL
TICK TITLE TRANSPOSE WIDTH X/Y/ZFORMAT X/Y/ZGRID
X/Y/ZLABEL X/Y/ZLIMIT X/Y/ZLOG X/Y/ZMAX X/Y/ZMIN X/Y/ZPIP
X/Y/ZPOW X/Y/ZREV X/Y/ZTICK


Triangular plot options (w-axis)


WFORMAT WLOG WMAX WMIN
WPIP WPOW WREV WTICK
PPLOT: Probability plots

Probability plots are useful for studying the distribution of a variable. PPLOT plots the
expected values of a variable against the corresponding percentage points of a
theoretical distribution such as the normal, chi-square, t, F, uniform, binomial, logistic,
exponential, gamma, Weibull, Gompertz, Gumbel, or Studentized range. Graphs
like this are called probability plots, or P plots. You can also plot the expected values
of one variable against those of another (P-P plot). SYSTAT can produce probability
plots for 38 probability distributions, discrete and continuous together.
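
As a rough illustration of how the points of a normal probability plot can be formed, the Python sketch below pairs each sorted data value with a standard normal quantile. The plotting position (i - 0.5)/n is one common convention and is an assumption here; the positions SYSTAT actually uses are not stated in this section, and the function name is hypothetical.

```python
from statistics import NormalDist

def normal_probability_points(values):
    """Pair each ordered value with the standard normal quantile at (i - 0.5)/n."""
    n = len(values)
    ordered = sorted(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (x, std_normal.inv_cdf((i - 0.5) / n))
        for i, x in enumerate(ordered, start=1)
    ]

# Hypothetical GROWTH values
growth = [3.0, 1.0, 2.0]
points = normal_probability_points(growth)
```

If the data follow the theoretical distribution, the resulting points fall close to a straight line.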


PPLOT xvarlist 
yvarlist * xvar 


The following options are available: 


/ PSCALE             displays probabilities (on the plot scale) instead of standard scores for
                     the distribution
  CHISQ=df           chi-square distribution with df degrees of freedom
  EXPO               exponential distribution
  GAMMA=shp,sc       gamma distribution with shape parameter shp and scale parameter sc
  NORMAL             normal distribution
  UNIFORM            uniform distribution
  WEIBULL=sc,shp     Weibull distribution with scale parameter sc and shape parameter shp
  T=df               t distribution with df degrees of freedom
  F=df1,df2          F distribution with df1 and df2 degrees of freedom
  BETA=shp1,shp2     beta distribution with shape parameters shp1 and shp2
  LOGISTIC           logistic distribution
  RANGE=k,df         Studentized range with parameters k and df
  POISSON=lambda     Poisson distribution with mean lambda
  BINOMIAL=n,p       binomial distribution with parameters n and p
  DUNIFORM=N         discrete uniform distribution with parameter N
  NBINOMIAL=k,p      negative binomial distribution with parameters k and p
  GEOMETRIC=p        geometric distribution with parameter p
  HGEOMETRIC=N,m,n   hypergeometric distribution with parameters N, m and n
  ZIPF=shp           Zipf distribution with shape parameter shp
  LNORMAL=loc,sc     lognormal distribution with location parameter loc and scale
                     parameter sc
  PARETO=thr,shp     Pareto distribution with threshold parameter thr and shape parameter
                     shp
  RAYLEIGH=sc        Rayleigh distribution with scale parameter sc
  CAUCHY             Cauchy distribution
  GUMBEL             Gumbel distribution
  DEXP               double exponential distribution
  GOMPERTZ=b,c       Gompertz distribution with parameters b and c
  ENORMAL=loc,sc     logitnormal distribution with location parameter loc and scale
                     parameter sc
  IGAUSSIAN=loc,sc   inverse Gaussian distribution with location parameter loc and scale
                     parameter sc
  TRIANGULAR=a,b,c   triangular distribution with parameters a (min), b (max) and c (mode)
  JITTER             moves data points by small uniform random amounts to expose points
                     that overlap
  ELL=n              draws a Gaussian bivariate ellipsoid for the sample using the p value
                     specified by n (0 <= n <= 1). Default is 0.6827
  ELM=n              draws a Gaussian confidence interval of the bivariate centroid using
                     the p value specified by n (0 <= n <= 1). Default is 0.95
  KERNEL=n           draws a confidence kernel of the p value specified by n (0 <= n <= 1).
                     Default is 0.6827
  HULL               surrounds all points in cloud with convex hull
  LINE               connects plot points as cases are ordered in the file with a line
  VECTOR=x,y         draws a vector plot. Lines are drawn from the point (x, y) to each data
                     point. If you do not specify a point, the lines are drawn from the
                     graph's origin.
  INFLUENCE          produces a plot where the size of each point represents the influence
                     of that point on the correlation between the two variables. Hollow
                     symbols represent positive influence; filled symbols represent
                     negative influence.
  SPAN               plots a minimum spanning tree
  FLOWER             produces a sunflower plot where the density of symbols is determined
                     by the number of observations falling on each point
  TSP                tries to find the shortest possible closed path that connects all the
                     points, with no repetitions
  VORONOI            plots a Voronoi tessellation
  DELAUNAY           plots the smallest number of triangles that connect all points
  BORDER =           produces a probability plot bordered by one of these univariate
                     density displays or a cumulative density display (as indicated, by
                     adding CUM):
    HIST             histogram (also HIST CUM)
    BOX              box-and-whisker display. The NOTCH option is also available
    DIT              dit plot
    DOT              symmetric dot plot
    DOX              box plot combined with a dot plot
    GAP              histogram with data-driven bar widths (also GAP CUM)
    POLY             frequency polygon (also POLY CUM)
    NORMAL           normal curve computed using the sample mean and standard
                     deviation (also NORMAL CUM)
    KERNEL           univariate nonparametric kernel density estimator. The TENSION
                     parameter is available (also KERNEL CUM)
    FUZZY            fuzzygram (also FUZZY CUM)
    STRIPE           density stripe
    JITTER           jittered density graph
  SPIKE=n            draws a vertical line from each point to the level on the vertical axis
                     corresponding to n
  HEX=n              splits the xy-plane into n x n grids for hexagonal binning (grids are not
                     displayed). The default value is 25. Minimum value is 2 and maximum
                     value is 50.
  RESID=method       plots residuals from a smoother
  POLAR              plots data in polar coordinates
  SMOOTH=method      draws a smoother through the plot points


Available methods are:


LINEAR   SPLINE    MODE       ANDREWS
QUAD     STEP      MIDRANGE   BISQUARE
LOG      NEXPO     KRIGING    HUBER
POWER    INVERSE              TRIMMED
LOWESS   MEAN
DWLS     MEDIAN


plus SHORT, TENSION, CONFI (These methods are defined for the PLOT command)
Examples:


PPLOT GROWTH                produces a probability plot which displays the values of
                            the variable growth against the corresponding percentage
                            points of a (NORMAL) theoretical distribution.

PPLOT INCOME93 * INCOME90   the expected values of the 1993 income variable are
                            plotted against the expected values of 1990 income.


Common Options: 


ACOLOR AXES COL COLOR CSIZE CUT 

DASH DUAL FCOLOR FILL FTITLE GROUP 
HEIGHT LABEL LEGEND LLABEL LOC LTITLE 
MULTIPLOT OVERLAY ROW SCALE SIZE SLOPE 
SPHERE STICK SYMBOL TICK TITLE TRANSPOSE 


WIDTH X/YFORMAT X/YGRID X/YLABEL X/YLIMIT X/YLOG
X/YMIN X/YMAX X/YPIP X/YPOW X/YREV X/YTICK

PROFILE: Profile charts


PROFILE charts show counts with categories or display means or medians of 
continuous data against categorical data. 


PROFILE x-varlist
        . * y-var * x-var
        y-varlist * x-var
        z-varlist * y-var * x-var


The following options are available: 


/ MEDIAN displays the median for each variable in y-varlist (for a 2-D plot) or z-varlist 
(for a 3-D plot). 


PERCENT displays each count, mean, or median as the percentage of that group’s 
contribution to the total. You may want to use STACK with this option. 


STACK profiles are stacked instead of overlaid. 
POLAR produces a chart in polar coordinates. 

Examples:

PROFILE SEX$ EDUCATN               Two profile charts of counts arranged side by side.
                                   One profile chart displays the total number of males
                                   and females, and the second chart displays the total
                                   number of people in each education level.

PROFILE . * EDUCATN * SEX$         A 3-D display where the height of each profile
                                   represents the total number of males and females in
                                   each level of education.

PROFILE INCOME * EDUCATN           The profiles represent average income for each level
                                   of education.

PROFILE INCOME * EDUCATN * SEX$    A 3-D display representing average income for
                                   males and females in each level of education.

Common Options:

ACOLOR ALTITUDE AXES COL COLOR DIRECTION
ERROR ETHICK ETYPE FCOLOR FILL GROUP
HEIGHT LEGEND LLABEL LOC MATRIX MULTIPLOT
OVERLAY PROJECT REPEAT SCALE SERROR SLOPE
SPHERE STICK THREED TICK TITLE TRANSPOSE
WIDTH X/Y/ZFORMAT X/Y/ZGRID X/Y/ZLABEL X/Y/ZLIMIT X/Y/ZLOG
X/Y/ZMAX X/Y/ZMIN X/Y/ZPIP X/Y/ZPOW X/Y/ZREV X/Y/ZTICK

PYRAMID: Pyramid charts

[Figure: pyramid charts of INCOME by EDUCATN]

PYRAMID charts show counts of categories or show means or medians for continuous 
data against categorical data. 


PYRAMID x-varlist
        . * y-var * x-var
        y-varlist * x-var
        z-varlist * y-var * x-var


The following options are available: 


/ MEDIAN displays the median for each variable in y-varlist (for a 2-D plot) or z-varlist (for
a 3-D plot).
PERCENT each category represents the percentage of that group's contribution to the total. 
BASE=n anchors base at n. 


BTHICK=n controls the thickness of the pyramid base as compared to the amount of space 
between tick marks. Specify (0 <= n <= 1). 


POLAR produces a chart in polar coordinates. 
Examples: 
PYRAMID SEX$ EDUCATN Two pyramid charts of counts arranged side by side. 


One pyramid chart displays the total number of 
males and females, and the second chart displays 
the total number of people in each education level. 

PYRAMID . * EDUCATN * SEX$ A 3-D display where the height of each pyramid 
represents the total number of males and females in 
each level of education. 

PYRAMID INCOME * EDUCATN The pyramids represent average income for each 
level of education. 

PYRAMID INCOME * EDUCATN * SEX$ A 3-D display representing average income for
males and females in each level of education.

Common Options:

ACOLOR ALTITUDE COL COLOR CSIZE DIRECTION
ETHICK ETYPE FCOLOR FILL HEIGHT LABEL
LEGEND LLABEL MATRIX MULTIPLOT OVERLAY PROJECT
SCALE SERROR SLOPE SPHERE THREED TICK
TITLE TRANSPOSE TRIANGULAR X/Y/ZFORMAT X/Y/ZGRID X/Y/ZLABEL
X/Y/ZLIMIT X/Y/ZLOG X/Y/ZMIN X/Y/ZPIP X/Y/ZPOW X/Y/ZREV
X/Y/ZTICK

QPLOT: Quantile plots

[Figure: quantile plot of LIFEEXPM]


QPLOT produces quantile plots (Q plots). Unlike probability plots, which compare a 
sample to a theoretical probability distribution, a quantile plot compares a sample to its 
own quantiles (a one-sample plot) or to another sample (a two-sample or Q-Q plot). 
The quantile of a sample is the data point corresponding to a given fraction of the data. 
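
The definition above can be illustrated outside SYSTAT with a short Python sketch. It is not SYSTAT's algorithm: the function names are hypothetical, and the fraction assigned to each sorted value, (i + 0.5) / n, is one common plotting-position convention chosen here as an assumption.

```python
def sample_quantiles(values):
    """Pair each sorted data point with the fraction of the data it represents.

    Assumes the (i + 0.5) / n plotting position; other conventions exist.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [((i + 0.5) / n, v) for i, v in enumerate(ordered)]


def qq_pairs(sample_a, sample_b):
    """Two-sample (Q-Q) pairs for samples of equal size: match sorted values."""
    if len(sample_a) != len(sample_b):
        raise ValueError("this sketch only handles samples of equal size")
    return list(zip(sorted(sample_a), sorted(sample_b)))


one_sample = sample_quantiles([30, 10, 20, 40])
two_sample = qq_pairs([3, 1, 2], [10, 30, 20])
```

A one-sample plot would draw the (fraction, value) pairs; a two-sample plot would draw the matched pairs against each other.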


QPLOT xvarlist 
yvarlist * xvar 


The following options are available: 


/ SMOOTH=method    draws a smoother through the plot points. Available methods are:

      LINEAR   SPLINE    MODE       KRIGING
      QUAD     STEP      MIDRANGE
      LOG      NEXPO     ANDREWS
      POWER    INVERSE   BISQUARE
      LOWESS   MEAN      HUBER
      DWLS     MEDIAN    TRIMMED

      plus SHORT, TENSION, CONFI. (These methods are defined for the
      PLOT command.)

  RESID=method     plots residuals from a smoother
  BORDER =         produces a quantile plot bordered by one of these univariate density
                   displays or a cumulative density display (as indicated, by adding
                   CUM):

      HIST         histogram (also HIST CUM)
      BOX          box-and-whisker display. The NOTCH option is also available.
      DIT          dit plot
      DOT          symmetric dot plot
      DOX          box plot combined with a dot plot
      GAP          histogram with data-driven bar widths (also GAP CUM)
      POLY         frequency polygon (also POLY CUM)
      NORMAL       normal curve computed using the sample mean and standard deviation
                   (also NORMAL CUM)
      KERNEL       univariate nonparametric kernel density estimator. The TENSION
                   parameter is available (also KERNEL CUM).
      FUZZY        fuzzygram (also FUZZY CUM)
      STRIPE       density stripe
      JITTER       jittered density graph

  JITTER           moves data points by small uniform random amounts to expose points
                   that overlap
  ELL=n            draws a Gaussian bivariate ellipsoid for the sample using the p value
                   specified by n (0 <= n <= 1). Default is 0.6827.
  ELM=n            draws a Gaussian confidence interval of the bivariate centroid using
                   the p value specified by n (0 <= n <= 1). Default is 0.95.
  KERNEL=n         draws a confidence kernel of the p value specified by n (0 <= n <= 1).
                   Default is 0.6827.
  HULL             surrounds all points in cloud with convex hull
  LINE             connects plot points as cases are ordered in the file with a line
  VECTOR=x,y       draws a vector plot. Lines are drawn from the point (x, y) to each data
                   point. If you do not specify a point, the lines are drawn from the
                   graph's origin.
  INFLUENCE        produces a plot where the size of each point represents the influence of
                   that point on the correlation between the two variables. Hollow symbols
                   represent positive influence and filled symbols represent negative
                   influence.
  SPAN             plots a minimum spanning tree
  FLOWER           produces a sunflower plot where the density of symbols is determined
                   by the number of observations falling on each point
  TSP              tries to find the shortest possible closed path that connects all the
                   points, with no repetitions
  VORONOI          plots a Voronoi tessellation
  DELAUNAY         plots the smallest number of triangles that connect all points
  SPIKE=n          draws a vertical line from each point to the level on the vertical axis
                   corresponding to n
  HEX=n            splits the xy-plane into n x n grids for hexagonal binning (grids are not
                   displayed). The default value is 25. Minimum value is 2 and maximum
                   value is 50.
  POLAR            plots data in polar coordinates

Examples:

QPLOT INCOME EDUCATN         produces two quantile plots arranged side by side. One plot
                             compares income to its own quantiles, and the other
                             compares education to its own quantiles.

QPLOT INCOME93 * INCOME90    the quantiles of 1993 income are plotted against the quantiles
                             of 1990 income.


Common Options: 
ACOLOR AXES COL COLOR CSIZE 
DASH DUAL FCOLOR FILL FTITLE GROUP 
HEIGHT LABEL LEGEND LLABEL LOC LTITLE 
MULTIPLOT OVERLAY PROJECT ROW SCALE SIZE 
SLOPE SPHERE STICK SYMBOL TICK TITLE 
TRANSPOSE WIDTH X/YFORMAT X/YGRID X/YLABEL X/YLIMIT 
X/YLOG X/YMAX X/YMIN X/YPIP X/YPOW X/YREV
X/YTICK

SPLOM: Scatterplot matrices

[Figure: scatterplot matrix of WORKWEEK, PCTTAXES, and BIG MAC]

SPLOMs show separate plots for each possible pair of the variables you select in a
single matrix display. SPLOM stands for "ScatterPLOt Matrix." SPLOMs are also
called casement plots.


SPLOM row-varlist 
row-varlist * column-varlist 


The following options are available: 


/ HALF omits the half of the SPLOM above the diagonal
DENSITY = specifies the type of density display (add CUM for cumulative) for the 
diagonal of the matrix. Default: DENSITY=HIST 

HIST histogram (also HIST CUM) 
BOX box-and-whisker display. 
DIT dit plot 
DOT symmetric dot plot 
DOX box plot combined with a dot plot 
GAP histogram with data-driven bar widths (also GAP CUM) 
POLY frequency polygon (also POLY CUM) 


NORMAL normal curve computed using the sample mean and standard deviation 
(also NORMAL CUM). Bars are omitted. 


KERNEL univariate nonparametric kernel density estimator (also KERNEL 
CUM). 

FUZZY fuzzygram (also FUZZY CUM) 

STRIPE density stripe 

JITTER jittered density graph 


CUM for DENSITY = HIST, GAP, POLY, FUZZY, NORMAL or KERNEL,
used to display cumulative densities.
NOTCH use with DENSITY = BOX or DOX to insert notches that mark
confidence intervals.
SMOOTH=method draws a smoother through the plot points. Available methods are:

LINEAR SPLINE MODE TRIMMED
LOWESS STEP MIDRANGE DWLS
NEXPO KRIGING QUAD INVERSE
ANDREWS LOG MEAN BISQUARE
POWER MEDIAN HUBER

plus SHORT, TENSION, and CONFI. (These methods are defined
for the PLOT command.)


ELL=n draws a Gaussian bivariate ellipsoid for the sample using the p value
specified by n (0 < n < 1). Default is 0.6827.

ELM=n draws a Gaussian confidence interval of the bivariate centroid using the
p value specified by n (0 < n < 1). Default is 0.95.

KERNEL=n draws a confidence kernel of the p value specified by n
(0 < n < 1). Default is 0.6827.

HULL surrounds all points in cloud with convex hull 

FLOWER produces a sunflower plot where the density of symbols is determined 
by the number of observations falling on each point 

INFLUENCE produces a plot where the size of each point represents the influence of 
that point on the correlation between the two variables. Hollow symbols 
represent positive influence and filled symbols represent negative influ- 
ence. 

JITTER moves data points by a small uniform random amount 

SPAN plots a minimum spanning tree 

SPIKE=n draws a vertical line from each point to the level on the vertical axis cor- 
responding to n 

VECTOR=x,y draws a vector plot. Lines are drawn from the point (x,y) to each data 
point. If you do not specify a point, the lines are drawn from the graph’s 
origin. 

TSP tries to find the shortest possible closed path that connects all the points, 
with no repetitions. 

VORONOI plots a Voronoi tessellation 

DELAUNAY plots the smallest number of triangles that connect all points 

LINE connects plot points as cases are ordered in the file with a line 


Examples:

SPLOM HEALTH EDUC MIL              creates a matrix containing individual scatterplots of
                                   the health, education, and military variables against
                                   each other. Each variable appears on a row and a
                                   column of the scatterplot matrix. Instead of plotting
                                   a variable against itself, the diagonal of the matrix
                                   displays the density of each variable.

SPLOM HEALTH EDUC MIL * GDP_CAP    produces a scatterplot matrix with health, education,
                                   and military as the row variables against only one
                                   column variable, gross domestic product.

Common Options:

ACOLOR COL COLOR CSIZE CUT DASH
FCOLOR FILL FTITLE GROUP HEIGHT LABEL
LEGEND LLABEL LOC LTITLE OVERLAY ROW
SIZE SYMBOL TICK TITLE TRANSPOSE WIDTH
X/YLABEL X/YLOG X/YMAX X/YMIN X/YPOW X/YREV
X/YTICK

WRITE: Writing text

The WRITE command writes text.

WRITE 'text'

The following options are available:

/ LOC=x,y          x,y location of the written string.
  ANGLE=n          tilts text at an n degree angle.
  HEIGHT=n unit    specifies the font size in default units, or in
                   points (PT), inches (IN), or centimeters
                   (CM) by including the unit.


You can produce superscripts, subscripts, and other special features with the WRITE 
command by embedding backslash codes inside the text string. The available 
backslash codes are: 


Backslash code   Effect           Backslash code   Effect

\1               Stroke font      \*               overstrike
\2               Swiss font       \>               superscript
\3               British font     \<               subscript
\4               Hershey font     \+               up one level
\5               Greek font       \-               down one level
\H               double height    \=               normal level and size
\h               half height      \I               italic
\W               double width     \R               roman
\w               half width       \\               backslash

Examples:

WRITE '1994 AVERAGE INCOME' / HEIGHT = .2IN
WRITE '\IY \R= \IMX \R+ \IB' (produces the slope-intercept line equation with Y, M, X, and B
italicized.)
PLOT Y*X / TITLE="\IITALICS TITLE" XLABEL="\5M"


Common Options: 


COLOR HEIGHT LOC WIDTH

Global Features for Graphs


Global features affect all graphs until you change them or restart SYSTAT. The setup 
for using global commands is: 


USE filename 
global commands (one per line) 
graph commands (one or more) 


Overlaying, sizing, and positioning displays 


BEGIN/END commands 


BEGIN 
graph commands 
END 


Use BEGIN...END to overlay plots or, when combined with the LOC option, to position
two or more graphs in a single display. That is, precede your set of commands with
BEGIN and complete them with END. The optional tree heading entry defines text to
display in the Output Organizer instead of "Begin/End Plot". You can USE multiple
data files within a BEGIN...END. In addition, you can SELECT different cases for each
graph created. However, only one EYE setting should be used.


ORIGIN command 


ORIGIN x unit y unit 


Positions the origin of the plot on the page (that is, the lower left corner of the
display). The default position is 2.25 IN, -5.5 IN. The unit can be inches (IN) or
centimeters (CM). The default is IN. ORIGIN changes the position of all subsequent
graphs. To change the position of an individual graph, use the LOC option.

[Figure: page layout showing the default ORIGIN 2.25 IN, -5.5 IN and LOC = 0IN, 0IN]


Examples: 


ORIGIN 0 IN 3 IN 


ORIGIN 2.6 CM 9 CM 


SCALE command 


SCALE x y 


Controls the height-width proportions, as in a photocopy reduction or enlargement. 
The unit is percentage of the standard size. 


Examples: 
SCALE 90 80 


SCALE 120 120 


EYE command 


EYE x, y, z / RECTANGULAR
EYE θ degrees, φ degrees, radius / SPHERICAL

Controls the perspective for viewing a three-dimensional plot.

/ RECTANGULAR= x, y, z    Defaults are approximately x=-6, y=-8, z=6.

  SPHERICAL= θ, φ, r      θ rotates (in °) horizontally (longitude), φ rotates (in °)
                          vertically (latitude), and r determines how close to center of
                          object. Radius unit is optional. Defaults are approximately
                          θ=-125, φ=30, and radius=16.
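
As a rough illustration of how the two EYE forms relate, the following Python sketch converts a rectangular viewpoint to longitude and latitude angles. The angle convention used here (longitude measured in the xy-plane from the x axis, latitude measured up from the xy-plane) is an assumption for illustration, not something the manual states, and the function name is hypothetical. Under that assumption the rectangular defaults give angles close to the spherical defaults; the computed radius is in coordinate units and need not match the documented radius, whose unit is not specified.

```python
import math


def rect_to_spherical(x, y, z):
    """Convert a rectangular viewpoint to (longitude, latitude, radius).

    Assumed convention: longitude is the angle in the xy-plane from the
    x axis; latitude is the elevation above the xy-plane. Angles in degrees.
    """
    radius = math.sqrt(x * x + y * y + z * z)
    longitude = math.degrees(math.atan2(y, x))
    latitude = math.degrees(math.asin(z / radius))
    return longitude, latitude, radius


# Approximate rectangular defaults quoted in the text.
longitude, latitude, radius = rect_to_spherical(-6.0, -8.0, 6.0)
print(round(longitude, 1), round(latitude, 1))
```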


FACET command 
FACET XY 
XZ 
YZ 


Specifies the plane onto which subsequent two-dimensional graphs are plotted. This is 
used for overlaying 2-D plots in 3-D perspective. 


Graph appearance features 


FONT command 


FONT STROKE 
SWISS 
BRITISH 
HERSHEY 
GREEK 


Specifies the typeface (or font) for your graph. For italic type, add the ITALIC option. 
Fonts are mapped to Windows TrueType fonts.

Command argument    TrueType font

STROKE              Arial (default)
SWISS               Arial Bold
BRITISH             Times New Roman
HERSHEY             Bookman Old Style
GREEK               Symbol


CSIZE command 


CSIZE n 


Controls the character size of graph scales, axes labels, legends, and titles. To return to 
default character size, type CSIZE without an argument. The default value is 1; 2 is 
twice the usual size, 0.5 is half the usual size. To control the size of characters used 
inside graphs (labels or letters used as plot symbols), use the CSIZE option. 


DEPTH command 


DEPTH n unit 


Controls the position of a plane along a facet. The unit can be inches (IN) or centimeters
(CM). DEPTH 3 IN specifies a 3-inch depth, and DEPTH (with no specification) returns
to the default value.


THICK command 


THICK n 


Sets the thickness of lines on a graph. The default value is 1; 2 is twice as thick.


MONOCHROME command 


MONOCHROME ON 


Sets the color palette to 256 shades of gray for all subsequently created graphs. Use 
MONOCHROME OFF to create graphs using a color palette.

Local Options for Graphs


Use the options listed here to customize SYSTAT graphical displays. Type the option 
list following a slash. For example: 


BAR INCOME * REGION$ / FILL=3 HEIGHT=4IN


For FPLOT, use a semicolon in place of the slash. 


These options apply to the current display only. 


X, Y, and Z options 


Many of SYSTAT's options are preceded by X, Y, or Z, such as MIN, MAX, LOG, TICK, 
PIP, REV, LIMIT, LABEL, and FORMAT. X, Y, and Z are assigned as follows: 


PLOT var3 * var2 * var1
      Z      Y      X

Note that the X, Y, or Z specification does not necessarily refer to the x, y, or z axis, but
rather the X, Y, or Z variable. (For example, the TRANSPOSE option rotates a plot 90
degrees, putting the X variable on the y axis.)

Plot size

/ HEIGHT=n unit      specifies the physical height and width of the plot (and altitude for a 3-D
  WIDTH=n unit       plot). The unit can be IN (inches), CM (centimeters), or PT (72nd of an
  ALTITUDE=n unit    inch). If you do not specify a unit, the plot is 4IN by 4IN. Example:
                     HEI=3IN, WID=6IN

Plot position 


/ LOC=n1,n2    specifies the location of the plot as x,y coordinates from the ORIGIN
               (the point on the page where the lower left corner of the plot is drawn).
               For example, to move a plot 1 inch to the right and 2 inches up from the
               ORIGIN, specify LOC=1IN, 2IN.


Axis labels, legends, and titles

/ XLABEL='text'               labels the axis with the text string specified by text. Example:
  YLABEL='text'               XLABEL='Birth Rate'
  ZLABEL='text'

  LEGEND=x unit y unit        determines the location of the lower left corner of the legend.
         NONE                 The x and y values refer to the distance from the origin. The
                              default unit is IN (inches). You can also specify CM (for
                              centimeters) or PT (for points). The default placement is at the
                              lower right corner of the graph. The size of the legend depends
                              on the size of the plot and CSIZE. To delete the legend, use NONE.
                              Example: LEGEND=3IN 5IN

  LLABEL='text1','text2'...   prints labels up to 80 characters long for each value in the legend
                              of a multivalued graph.
                              Example: LLABEL='Income','Budget'

  LTITLE='text'               places a title above the legend.
                              Example: LTITLE="Region of the World"

  TITLE='text'                prints text above the graph.
                              Example: TITLE="Taxes Paid in '92"

  FTITLE = off                hides titles of frames for the frames generated by GROUP option
  FTITLE = on                 displays titles of frames for the frames generated by GROUP
                              option (if this option is not mentioned, by default frame titles are
                              shown)



Plot symbols

/ SIZE=n              n is a symbol size multiplier. If you specify a variable, symbols are
       var            multiplied by the value of var for each case. If you specify SIZE=0,
                      no points are printed; for SIZE=2, the size is double that of the
                      default size. Example: SIZE=1.5

  SYMBOL=n1,n2,...    specifies plot symbols. Specify one number, character string, or
         charlist     variable for each y variable you plot. Numbers specify preset symbols
         varlist      (0 < n < 24). You can specify character or numeric variables. If you
                      specify a character variable, SYSTAT uses the first letter of each
                      value as the plot symbol. Numeric variables should contain values
                      between 1 and 23 which are plotted with the corresponding SYSTAT
                      plot symbols (see below). Examples: SYM=2,3,5 or SYM='M','F'

[Figure: chart of the preset SYSTAT plot symbols]



Scales and Axes

/ XMAX=n        specifies the maximum scale value. Example: XMAX=10.6
  YMAX=n
  ZMAX=n
  WMAX=n
  XMIN=n        specifies the minimum scale value. Example: XMIN=0
  YMIN=n
  ZMIN=n
  WMIN=n
  AXES=type     specifies the number of axes to print, using the following types:

                2-D axes: Box, Vertical, Horizontal, None, Bottom, Top, Left, Right
                3-D axes: Book, Box, None, Fork, Spoon, Corner, Cross

                Example: AXES=RIGHT

                For polar coordinates, specify AXES=BOTTOM for the circle axis
                alone, AXES=LEFT for the radial axis, and AXES=NONE for no axes.
                The default is to display both axes for polar coordinates.

  SCALE=type    specifies which axes to print scales on, using the same types shown in
                AXES above. Example: SCALE=TOP
  TRANSPOSE     rotates plot 90 degrees
  XTICK=n       divides the axis into n intervals. Example: XTICK=6
  YTICK=n
  ZTICK=n
  WTICK=n
  STICK=IN      forces tick marks inside, outside, or through the graph frame (axes).
        OUT     The default is inside the frame. Example: STICK=OUT
        THROUGH
  TICK=FLUSH    locates tick marks in relation to the ends of the axis. To prevent the tick
       FLOAT    marks on the x and y axes from overlapping, specify INDENT to indent
       INDENT
                bers and the data fills 90% of the frame. Example: TICK=INDENT
  XPIP=n
  YPIP=n
  ZPIP=n
  WPIP=n

Transforming data


/ XLOG=n    logs the data to the base n before plotting. Use a real number such that
  YLOG=n    1 < n <= 10. XLOG with no number is log base 10. Example:
  ZLOG=n    XLOG=8
  WLOG=n
  XPOW=n    the power n to which each data value is raised. If you don't specify a
  YPOW=n    value for n, 0.5 is used for a square root transformation. Example:
  ZPOW=n    XPOW=2 to square each data value.
  WPOW=n
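
The arithmetic behind these two transformations can be sketched in Python. This only mirrors the description above (base-10 logs and a 0.5 power when no value is given); the function names are hypothetical and nothing here reproduces how SYSTAT rescales or labels the axis.

```python
import math


def log_transform(values, base=10.0):
    """Log each value to the given base (base 10 when no base is given)."""
    return [math.log(v, base) for v in values]


def power_transform(values, power=0.5):
    """Raise each value to a power (0.5, a square root, when none is given)."""
    return [v ** power for v in values]


logged = log_transform([1.0, 100.0])           # base 10
logged_base8 = log_transform([64.0], base=8)   # comparable to XLOG=8
roots = power_transform([4.0, 9.0])            # default square root
squares = power_transform([3.0], power=2)      # comparable to XPOW=2
```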


Multivariable and multigroup displays

/ GROUP=grpvar1,    forms a separate display for each unique combination of levels of one or
        grpvar2     two grpvars.

  MULTIPLOT         produces a multiplot layout, similar to a Trellis, when the GROUP
                    option is used.

  DUAL=grpvar       mirrors (back-to-back) display for two groups. grpvar can have two
                    levels only, otherwise it will take first two levels in ascending order.

  REPEAT            for multivariable display, places the levels of the repeated measures
                    along the x axis. For example:

                    BAR y-var1, y-var2, y-var3 / GROUP=grpvar REPEAT OVERLAY

                    produces one frame with the bars for each group of y-var1 first, followed
                    by the bars for the groups of y-var2, and so on.

  MATRIX            plots all values of a data matrix against their row and column indices. For
                    example: BAR z-varlist / MATRIX

  OVERLAY           for multivariable displays (or multiple groups), the variables (or groups)
                    are displayed within a single frame.

  ROW=n             number of rows in multipanel displays.

  COL=n             number of columns in multipanel displays.

Ordering categories

Use the ORDER command to define and order categories for graphical displays,
especially those with character codes.

/ XREV    reverses the scale
  YREV
  ZREV
  WREV



Error bars

/ SERROR=n or vars    standard error bars. Specify 0 <= n <= 1 for the confidence level or
                      request that SYSTAT use the values of the vars listed. For one standard
                      error of the mean (the default), specify n = 0.6827; for two standard
                      errors, n = 0.9545, etc.
  ERROR=n or vars     error bars in standard deviation units. Specify 0 <= n <= 1 for the
                      confidence level or request that SYSTAT use the values of the vars listed.
                      For one standard error of the mean (the default), specify n = 0.6827; two
                      standard errors, n = 0.9545, etc.
  DIRECTION=var       variable or variables specifying direction of error bar. Values of this
                      variable control direction (0 = none, positive number = up, negative number
                      = down).
  ETYPE=TICK          type of error bar. TICK, the default, is a vertical line bounded by
        BOX           horizontal ticks. BOX is a vertical box.
  ETHICK=n            length of horizontal ticks that bound the error bar or width of the error
                      box (ETYPE=BOX). Specify 0 < n <= 1, where for 1.0, the tick (or box)
                      is as wide as a default symbol in a dot plot; 0.5, the length (or width) is
                      50% of what a symbol would be, and so on. Default is 0.5.
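
The values 0.6827 and 0.9545 quoted above are the two-sided normal coverage probabilities for one and two standard errors. A small Python check, offered as an illustration of where those numbers come from rather than as anything SYSTAT runs (the function name is hypothetical):

```python
import math


def coverage_for_standard_errors(k):
    """Two-sided normal probability of falling within k standard errors."""
    return math.erf(k / math.sqrt(2.0))


one_se = round(coverage_for_standard_errors(1), 4)  # 0.6827
two_se = round(coverage_for_standard_errors(2), 4)  # 0.9545
```

Read in the other direction, a confidence level n given to SERROR corresponds to the number of standard errors k whose coverage equals n.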


Color, lines, fill patterns, and control limits 


/ COLOR or            colors the elements of a graph
  FCOLOR or           (bars, points, lines, etc.), ACOLOR to color axes, scales, and labels, and
  ACOLOR=colorlist    FCOLOR to color the
         numlist      one of the first ten SYSTAT-defined names or one of 12 integers.
         varlist      Available colors include: red (1), blue (2), green (3), yellow (4), orange (5),
                      brown (6), gray (7), violet (8), white (9), black (10), cyan (11), and
                      magenta (12). Decimal values between 0 and 1 represent spectral colors
                      between blue and red. Character or numeric variables which contain
                      these numbers or color names can also be used. Examples:
                      COLOR=green
                      COLOR=BLUE,GREEN,YELLOW

  FILL=n1,n2,...      specifies fill patterns for symbols, bars, etc. Specify a fill pattern for
       varlist        each variable or group. Use an integer between 1 and 7, inclusive, for
                      legends in case graphs that do not have an enclosed region.
                      Example: FILL=1,2

                      [Figure: samples of the fill patterns]

                      Specify a decimal number between 0 and 1 for a percentage "screen" of
                      black. Use a variable name to use the values of that variable to
                      determine shading. As an alternative, you can use the COLOR option to
                      produce shades of gray for black and white printing.

  XGRID               draws grid lines extending from tick marks. For 2-D plots, XGRID
  YGRID               specifies vertical grid lines and YGRID, horizontal. The default is no grid.
  ZGRID               Example: XGRID
  XLIMIT=n,p          adds dashed lines to axes to mark control limits. If you specify only one
  YLIMIT=n,p          number, only one limit line will be drawn.
  ZLIMIT=n,p          Example: XLIMIT=44,90
  DASH=n              specifies type of line used in plot elements (including lines connecting
                      points, smoothers, ellipses, limits, etc.). Enter a number for each y
                      variable plotted (or group). Example: DASH=3,5

                      [Figure: samples of line types 1 through 12]

SLOPE Cleveland Median Slope Adjustment. This option automatically scales line
graphs for data such as time series. It adjusts the height and width of the
graph so that the median absolute physical slope of the plotted line
segments is one.
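
The idea behind the SLOPE adjustment can be sketched in Python as follows. This is a minimal sketch of the stated criterion (median absolute physical slope equal to one), not SYSTAT's implementation; the function name is hypothetical, and the assumption is that the data fill the frame so that physical slope equals the range-normalized slope times height/width.

```python
from statistics import median


def median_slope_aspect(xs, ys):
    """Return the height/width ratio that makes the median absolute
    physical slope of consecutive line segments equal to one.

    Assumes the data span the full frame in both directions.
    """
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    slopes = []
    for i in range(len(xs) - 1):
        dx = (xs[i + 1] - xs[i]) / x_range
        dy = (ys[i + 1] - ys[i]) / y_range
        if dx != 0:  # skip vertical segments, whose slope is undefined
            slopes.append(abs(dy / dx))
    mid = median(slopes)
    if mid == 0:
        raise ValueError("median slope is zero; no finite aspect ratio")
    return 1.0 / mid


aspect = median_slope_aspect([0, 1, 2, 3, 4], [0, 2, 1, 3, 2])
```

For this zigzag series the normalized slopes have a median of 2, so a frame half as tall as it is wide brings the median physical slope to one.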

THREED adds a pseudo depth to a 2-D display, making it 3-D.


Formatting scales

/ XFORMAT=n                   writes numbers or plot scales with n digits following the
  YFORMAT=n                   decimal point. Use 0 <= n <= 9.
  ZFORMAT=n
  WFORMAT=n
  XFORMAT='picture format'    writes numbers or plot scales in the given format. Include
  YFORMAT='picture format'    one symbol for each character, digit, or symbol that you
  ZFORMAT='picture format'    want to print. Use a pound sign (#) to indicate each digit of
  WFORMAT='picture format'    a number and a period (.) to indicate the decimal point, if
                              needed. Use Y (years), M (months), D (days), h (hours), m
                              (minutes), and s (seconds) for dates and times.

Labels (within)

/ LABEL=var$    labels each case with the value of var$.
                Example: LABEL=country$

  LABEL         for BAR, PYRAMID, or PIE, prints the count (or mean) at the
                top or edge of each bar, pyramid, or slice.

  CENTER        centers labels on data points.

  CSIZE         specifies size of characters inside graphs (labels or letters used
                as plot symbols) without affecting the character size of the
                graph scales, axes labels, legends, or titles (use the CSIZE
                command to specify the size of these elements).


Surfaces

/ SURFACE=XCUT    for surface plots, specifies the direction of the surface cuts.
          YCUT    XCUT draws cuts perpendicular to the x axis. YCUT draws cuts
          ZCUT    perpendicular to the y axis. ZCUT draws cuts perpendicular to
  COLOR           the z axis. Use COLOR to automatically color the surface and
  FILL            FILL to fill the surface with patterns.

Curves

/ CUT=n    controls smoothers of a 2-D smoother or function.


Section 3

Statistics

This section alphabetically lists the statistical procedures available in SYSTAT. The
features and options of each procedure are defined. An asterisk denotes required
commands.

ANOVA: Analysis of Variance


Setup: 


SYSTAT provides two procedures for analysis of variance: Analysis of Variance 
(ANOVA) and General Linear Model (GLM). ANOVA is easier to use, because it includes 
all interactions in the model and tests them automatically. You can specify covariates, 
do repeated measures, save residuals, and test post hoc pairwise differences in means. 

Group sizes can be unequal for combinations of grouping factors, but each subject 
must have complete data across repeated measures. You can use numeric or character 
values to code grouping variables. You can store results of the analysis (predicted 
values and residuals) for further study and graphical display. In ANCOVA (using 
COVARIATE), you can save adjusted cell means. 

See GLM for randomized block designs, incomplete block designs, fractional 
factorials, Latin square designs, split plot, crossover designs, nesting, etc. Also, use 
GLM to test any contrast across cell means, including simple effects. 


* ANOVA 
t БЕ 
* DEPEND 
CATEGORY 
COVARIATE 
PLENGTH 
SAVE 
HOT * ESTIMATE
The m i i 
2 wh тугойу tie rea азл — are located on the General Linear 
HYPOTHESIS 
POST 
ERROR


EFFECT 
ALL 
WITHIN 
SPECIFY 
CONTRAST [matrix] 
AMATRIX [matrix] 
CMATRIX [matrix] 
DMATRIX [matrix] 
PAIRWISE 

HOT TEST 


Model Building 


For one-way and factorial designs (and some repeated measures designs), use 
CATEGORY, ANOVA, and ESTIMATE. For ANCOVA, insert COVARIATE before 
ESTIMATE. 


Note: To assign a specific order to the categories of a grouping variable, use ORDER 
command. All subsequent output reports and contrasts assume this order. 


DEPEND command 


DEPEND varlist 


Specifies var as a dependent variable in a fully factorial analysis of variance, or var1,
var2, ... for repeated measures.


/REPEAT= m, n, ... number of levels of each repeated measures factor. 
E m and n are as defined above. Use xi and yi to specify the 
REPEAT = т p "x e) scale for orthogonal polynomials. Specify one number for 
n (yt, y2, ..) each level of the repeated measure. 


NAMES = 'пате1', 'name?2 ', ... a name for each repeated measures factor. 


Note: To display results for single degree of freedom polynomial contrasts for each 
repeated measure factor, insert PLENGTH MEDIUM before ESTIMATE. 


Examples: 


DEPEND INCOME 


DEPEND TIME1 TIME2 TIME3 TIME4 / REPEAT 4(1, 2, 4, 8) NAMES = 'DAY' 


CATEGORY command 


CATEGORY grpvarlist 


Specifies numeric or string grouping variables to define cells. All subsequent contrasts 
assume the sorted ordering of the levels. 


/ MISS adds a category for cases with a missing value for the categorical variable in the 
analysis. 
EFFECT produces parameter estimates which are differences from group means. 
DUMMY produces dummy codes for the design variables instead of effect codes. 


Example: 


CAT SEX$ EDUCATN 
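
The difference between the EFFECT and DUMMY options can be illustrated outside SYSTAT. The Python sketch below is not SYSTAT code: the function name `design_columns` is hypothetical, and treating the last sorted level as the reference level is an assumption made only for illustration.

```python
def design_columns(levels, value, coding="effect"):
    """Return the k-1 design codes for one case of a k-level grouping variable.

    Assumption: the last level in sorted order is the reference level.
    SYSTAT may choose its reference level differently.
    """
    ordered = sorted(levels)
    reference = ordered[-1]
    if value == reference:
        # Effect coding marks the reference level with -1 in every column;
        # dummy coding marks it with 0 in every column.
        fill = -1 if coding == "effect" else 0
        return [fill] * (len(ordered) - 1)
    # Any other level gets a 1 in its own column and 0 elsewhere.
    return [1 if value == level else 0 for level in ordered[:-1]]


for group in ["a", "b", "c"]:
    print(group,
          design_columns(["a", "b", "c"], group, "effect"),
          design_columns(["a", "b", "c"], group, "dummy"))
```

The two codings differ only in the row for the reference level, which is why the resulting parameter estimates are interpreted differently.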


COVARIATE command 


COVARIATE varlist 
Specifies variables as covariates in a fully factorial analysis of variance. 
Example: 


COVAR AGE 


SAVE command 


SAVE filename 
Saves specified statistics to a file. 


/ ADJUST        adjusted cell means from analysis of covariance.
  COEFFICIENT   estimates of the regression coefficients.
  MODEL         specifies the variables from your MODEL statement in addition to the
                statistics given by RESID.
  PARTIAL       partial residuals.
  RESID         produces residuals, predicted cell means, and diagnostics.
  DATA          saves data from original file.


PLENGTH command 


PLENGTH SHORT 
MEDIUM 
LONG 


To request extended results, enter PLENGTH prior to ESTIMATE. SHORT prints the 
ANOVA table. The MEDIUM length adds least-squares means to the output. LONG adds 
estimates of the coefficients. 
ESTIMATE command


ESTIMATE 
Estimates the analysis specified in DEPEND.


/ CONFI=n          displays the confidence intervals of the regression coefficients at the
                   desired level of confidence. The confidence intervals are displayed only
                   when the MEDIUM or LONG option of PLENGTH is active. The
                   default level is 0.95.
  NTEST = KS       Kolmogorov-Smirnov test (Lilliefors) for testing normality.
          SW       Shapiro-Wilk test for testing normality.
          AD       Anderson-Darling test for testing normality.
  HTEST = LEVENE   Levene's test for homogeneity of variance.
  SS = TYPE1       uses type I sums of squares for analysis.
       TYPE2       uses type II sums of squares for analysis.
       TYPE3       uses type III sums of squares for analysis.
  QUICK            displays quick graphs even though the global graph option is off.
  NOQUICK          suppresses quick graphs even though the global option is on.

Hypothesis Testing 
HYPOTHESIS command 
HYPOTHESIS 


Tests hypotheses on a previous MODEL. You must enter HYPOTHESIS before you can 
use any of the following: 


POST CONTRAST 
ERROR AMATRIX 
EFFECT CMATRIX 
ALL DMATRIX 
WITHIN PAIRWISE 
SPECIFY TEST 


For descriptions, see GLM (General Linear Model). 


Toggling between the command line and the GUI is supported in ANOVA, GLM, MANOVA,
MIXED, LOGIT, and RSM. That is, if estimation is performed through a dialog box, then
post-estimation analysis can be performed through commands, and vice versa.
BAYESIAN:
Bayesian Regression 


BAYESIAN computes Bayesian estimates of the regression coefficients in a multiple
linear regression setup. The Bayesian approach involves selecting a dependent
variable and one or more explanatory variables, along with prior information on the
parameters of the model: the mean vector and covariance matrix of a multivariate
normal distribution for the regression coefficients, and the scale and shape
parameters of the inverse gamma distribution for the error variance. For each model
you fit in BAYESIAN, SYSTAT reports the Bayesian estimates, the mean and variance
of the scale and shape parameters of the posterior distribution of the parameters, along
with the credible interval (the Bayesian analog of the frequentist confidence interval) for
the estimated regression coefficients. Plots of the prior and posterior densities of
each regression parameter and of the variance are also produced.


Setup: 
* BAYESIAN 
* USE 
MODEL 
SAVE 
HOT * ESTIMATE 
MODEL command 


MODEL var = CONSTANT + var1 + var2 + ... + varn


Specifies the linear model to estimate. CONSTANT is an optional parameter (when in
doubt, include the CONSTANT).


SAVE command 


SAVE filename 
Saves specified statistics to a file.


/ COEF           Bayesian estimates of the regression coefficients.
  RESID / DATA   produces residuals, predicted values, and the data from original file.
  CONDITIONAL    the conditional covariance matrix of Bayesian regression coefficients
                 given sigma.
  MARGINAL       the marginal covariance matrix of Bayesian regression coefficients.


ESTIMATE command 


ESTIMATE 
Tells SYSTAT to estimate the parameters specified in the model. 
/ DIFFUSE                    uses diffuse priors for Bayesian estimation.
  MEAN=[b1; b2; b3; ...]     specifies the mean vector of the normal prior.
       or filename1
  VAR=[matrix]               specifies the covariance matrix of the normal prior.
      or filename2
  SHAPE=s1                   specifies the shape parameter of the gamma prior.
  SCALE=s2                   specifies the scale parameter of the gamma prior.
  CREDIBILITY=c              specifies the level of credible intervals.
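
To make the roles of MEAN, VAR, SHAPE, and SCALE more concrete, here is a minimal Python sketch of a conjugate update for a single regressor without a constant. It assumes the textbook normal-inverse-gamma setup, in which the prior variance of the coefficient is scaled by the error variance. SYSTAT's exact parameterization is not given in this section and may differ, and the function name `nig_posterior` is hypothetical.

```python
def nig_posterior(x, y, prior_mean, prior_var, shape, scale):
    """Conjugate update for y = b*x + e with one regressor and no constant.

    Assumed prior (an illustration, not SYSTAT's documented parameterization):
        b | sigma^2 ~ Normal(prior_mean, sigma^2 * prior_var)
        sigma^2     ~ Inverse-Gamma(shape, scale)
    """
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    syy = sum(yi * yi for yi in y)

    # Posterior precision and mean of b (in units of 1/sigma^2).
    post_prec = 1.0 / prior_var + sxx
    post_mean = (prior_mean / prior_var + sxy) / post_prec

    # Posterior shape and scale of the inverse gamma for sigma^2.
    post_shape = shape + n / 2.0
    post_scale = scale + 0.5 * (
        syy + prior_mean ** 2 / prior_var - post_mean ** 2 * post_prec
    )
    return post_mean, 1.0 / post_prec, post_shape, post_scale


mean_b, var_b, a_n, b_n = nig_posterior([1.0, 2.0], [2.0, 4.0], 0.0, 1.0, 2.0, 1.0)
print(mean_b, var_b, a_n, b_n)
```

With a tight prior centered at zero, the posterior mean is pulled below the least-squares slope of 2; a more diffuse prior (larger `prior_var`) moves it back toward 2.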
CLUSTER:
Cluster Analysis 


CLUSTER provides three broad classes of clustering: JOIN, K-Clustering, and ADD
trees (additive trees). JOIN comprises hierarchical linkage methods. K-Clustering
performs the KMEANS and KMEDIANS splitting methods, which are not necessarily
hierarchical. ADD forms trees whose path (branch) lengths represent similarities
among objects.


JOIN clusters cases, variables, or both variables and cases simultaneously, while
K-Clustering clusters cases only. ADD clusters a similarity or dissimilarity matrix.
Several distance metrics are available with the JOIN and K-Clustering methods,
including metrics for binary variables, quantitative variables, and frequency count data.
JOIN provides ten measures for amalgamating or linking objects and displays the results as
a tree (dendrogram) or as a polar dendrogram. When the MATRIX option is used to
cluster cases and variables simultaneously, SYSTAT uses a gray-scale or color
spectrum to represent the data values.


For each cluster in K-Clustering, KMEANS and KMEDIANS print descriptive
statistics for each variable and the distance from each case to its center. Within each
cluster for each variable, KMEANS and KMEDIANS also feature a profile display that shows
how the average for the cluster departs from the mean or median of the complete
sample.


JOIN operates on a rectangular SYSTAT file or a file containing a symmetric
matrix, such as those produced by CORR. ADD requires a file containing a symmetric
matrix. You can use the CORR procedure to produce a symmetric matrix if your data
are rectangular. K-Clustering works on the usual cases-by-variables data file only. You
can save cluster identifiers in a SYSTAT file with or without the input data.


Setup: 
Hierarchical      K-means            K-medians          Additive
tree              splitting method   splitting method   trees
* CLUSTER         * CLUSTER          * CLUSTER          * CLUSTER
* USE             * USE              * USE              * USE
IDVAR             IDVAR              IDVAR              HOT * ADD
SAVE              SAVE               SAVE
HOT * JOIN        HOT * KMEANS       HOT * KMEDIANS


Continue by modifying commands and/or by changing JOIN or K-Clustering options. 


The JOIN hierarchical tree method 


JOIN command 


JOIN varlist 


Performs hierarchical tree (linkage) clustering. JOIN works on a rectangular
SYSTAT file or a symmetric matrix, such as those produced with CORR. If no variables
are specified, every numeric variable is used. ROWS is the default.


/ ROWS          clusters rows (cases) of data.
  COLUMNS       clusters columns (variables) of data.
  MATRIX        clusters rows and columns at the same time to produce a shaded display.
                Both rows and columns are permuted to bring similar rows and columns
                next to one another. The permuted matrix is displayed using the number
                of symbols stated with SYMBOL. Also use MATRIX to see whether missing
                values cluster together in a systematic pattern.
  POLAR         polar coordinate plot.
  DISTANCE      specifies the metric for measuring distance for the raw and standardized
                data. If a case has one or more values missing, the remaining values are
                used in the computations. That is, distances are computed between each
                pair of cases (or variables) by using the values that are complete for both
                members of the pair. Select one of these metrics:
  PERCENT       percentage of disagreement (or mismatches) among values (not available
                with K-Clustering). Use with categorical or nominal scales.
  GAMMA         1 - (Goodman-Kruskal gamma coefficient). Use with rank order or
                ordinal scales.
  PEARSON       1 - (Pearson product-moment correlation coefficient). Use with
                continuous data.
  RSQUARED      1 - (square of Pearson product-moment correlation). Use with continuous
                data.
  EUCLIDEAN     Euclidean distance (root mean squared distances). Use with continuous
                data. Default.
  MINKOWSKI     Minkowski distance, or pth root of the mean pth powered coordinate
                distances. Use POWER to set p. Use with continuous data.
  POWER=p       power p for the Minkowski distance metric. Use p=1 for city-block
                distance and p=2 for Euclidean.
  CHISQUARE     chi-square measure for each 2 x n frequency table. Use with frequency
                count data.
  PHISQUARE     phi-square (chi-square/total) of 2 x n frequency table. Use with frequency
                count data.


  ABSOLUTE      distances are computed using absolute differences. Use this metric for
                quantitative variables. The computation excludes missing values.
  ANDERBERG     clustering uses a distance metric that is the dissimilarity form of
                Anderberg's similarity coefficient for binary data. This distance is
                available for hierarchical clustering only.
  JACCARD       clustering uses a distance metric that is the dissimilarity form of Jaccard's
                similarity coefficient for binary data. This distance is available for
                hierarchical clustering only.
  MAHALANOBIS   distances are computed using the square root of the quadratic form of
                deviations among random vectors using the inverse of their variance-
                covariance matrix. This metric also can be used to cluster groups. Use
                this metric with quantitative variables. Missing values are excluded from
                computations.
  COV           symmetric and positive definite matrix of order p by p (p is the number
                of elements).
  GROUP         specify the grouping variable for computing MAHALANOBIS distance.
  RT            clustering uses a distance metric that is the dissimilarity form of Rogers
                and Tanimoto's similarity coefficient for categorical data. This distance is
                available for hierarchical clustering only.
  RUSSEL        clustering uses a distance metric that is the dissimilarity form of Russel's
                similarity coefficient for binary data. This distance is available for
                hierarchical clustering only.
  SS            clustering uses a distance metric that is the dissimilarity form of Sneath
                and Sokal's similarity coefficient for categorical data. This distance is
                available for hierarchical clustering only.
  POWER = real  real is the double value only for DISTANCE.
  LINKAGE       selects the joining algorithm for amalgamating objects. LINKAGE specifies
                how distances between pairs of clusters are measured. Select one of
                these rules:
  SINGLE        uses the distance between the two closest members. This method tends to
                produce long, stringy clusters. If you use a similarity or dissimilarity
                matrix, this is Johnson's min method. Default.
  COMPLETE      uses distance between the most distant pair of objects. This method tends
                to produce compact, globular clusters. If you use a similarity or
                dissimilarity matrix, this is Johnson's max method.
  CENTROID      uses the average value of all objects in a cluster (the cluster centroid) as
                the reference point for distances to other clusters.
  AVERAGE       averages all distances between pairs of objects in different clusters.
  MEDIAN        uses the median distance between pairs of objects in different clusters.
  WARD          uses the average value of all objects in a cluster (the cluster centroid) as
                the reference point for distances to other clusters, with adjustments for
                covariances.
  WEIGHTED      uses a weighted average distance between pairs of objects in different
                clusters to decide how far apart they are. The weights used are
                proportional to the size of the cluster.
  FLEXIBETA     uses a weighted average distance between pairs of objects in different
                clusters to decide how far apart they are. You can specify the value of the
                weight B. The range of B is between -1 and 1.
  UNIFORM       uniform kernel method is a density linkage method. The estimated
                density is proportional to the number of cases in a sphere of radius r. A new
                dissimilarity matrix is then constructed using this density estimate.
                Finally, single linkage cluster analysis is performed.
  KNBD          k-th nearest neighborhood method is a density linkage method. The
                estimated density is proportional to the number of cases in the smallest
                sphere containing the k-th nearest neighbor. A new dissimilarity matrix is
                then constructed using this density estimate. Finally, single linkage cluster
                analysis is performed.
  K=k           specifies the k to be used for k-nearest neighborhood density method.
  RADIUS=r      specifies the radius r to be used for uniform kernel density method.
  BETA=b        specifies the weight b for Flexible beta linkage method.
  HEIGHT        provides the option of cutting tree at a specified height of the tree.
  LEAF          provides the option of cutting tree by number of leaf nodes.
  LENGTH        As you pass from node to node in order down the cluster tree, the color
                changes when the length of a node on the distance scale changes between
                less than and greater than the specified length of terminal nodes (on a
                scale of 0 to 1).
  PROP          colors are assigned based on the proportion of members in a cluster.
  VALIDITY      provides an option for selecting an index.
  RMSSTD        provides root-mean-square standard deviation of newly formed cluster.
  CHF           provides Calinski and Harabasz's pseudo F.
  PTS           provides pseudo T-square statistic for cluster assessment.
  DB            provides Davies-Bouldin's index for each hierarchy of clustering. This
                index is applicable for coordinate data only.
  DUNN          provides Dunn's cluster separation measure.
  MAX           performs the computation of indices up to this specified number of
                clusters. The default value is the square root of the number of objects.


Examples:


JOIN AGE INCOME / ROWS 


JOIN TRIAL(1.. 4) / COLUMNS 


JOIN / MATRIX 


JOIN / MATRIX SYMBOL=4 
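
The SINGLE, COMPLETE, and AVERAGE linkage rules described above differ only in how they summarize the pairwise distances between the members of two clusters. The Python sketch below illustrates that idea; it is not SYSTAT's implementation, and the names `cluster_distance` and `abs_diff` are hypothetical.

```python
from itertools import product


def cluster_distance(a, b, dist, linkage="single"):
    """Distance between clusters a and b (lists of objects) under a linkage rule."""
    pair_dists = [dist(p, q) for p, q in product(a, b)]
    if linkage == "single":      # distance between the two closest members
        return min(pair_dists)
    if linkage == "complete":    # distance between the most distant pair
        return max(pair_dists)
    if linkage == "average":     # mean of all between-cluster pair distances
        return sum(pair_dists) / len(pair_dists)
    raise ValueError(f"unsupported linkage: {linkage}")


def abs_diff(p, q):
    """Absolute difference between two one-dimensional objects."""
    return abs(p - q)


left, right = [0.0, 1.0], [4.0, 6.0]
for rule in ("single", "complete", "average"):
    print(rule, cluster_distance(left, right, abs_diff, rule))
```

Single linkage reports the smallest gap between the clusters, which is why it tends to chain clusters together, while complete linkage reports the largest gap and favors compact clusters.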


K-Means splitting method 


KMEANS command 


KMEANS varlist 


KMEANS splits cases into a selected number of groups by maximizing between-cluster 
variation and minimizing within-cluster variation. KMEANS only uses rectangular 
SYSTAT files. If no variables are specified, every numeric variable is used. 
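
As a rough illustration of how a splitting method trades off between-cluster and within-cluster variation, the following Python sketch runs a plain Lloyd-style k-means iteration. It is not SYSTAT's algorithm: the function name `kmeans` is hypothetical, and using the first k cases as seeds is an assumption chosen to keep the example deterministic.

```python
def kmeans(points, k=2, max_iter=20):
    """Assign each point to its nearest center, then recompute the centers.

    Assumption: the first k points are the initial seeds. The defaults for
    k and max_iter mirror the NUMBER and ITER defaults stated in this section.
    """
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


labels, centers = kmeans([(1.0,), (2.0,), (9.0,), (10.0,)], k=2)
print(labels, centers)
```

On this small one-dimensional example the two low values and the two high values end up in separate clusters after two reassignment passes.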


/ NUMBER=n      number of clusters. The default is 2, or one split of data.
  ITER=n        maximum number n of iterations. The default is 20.
  DISTANCE      specifies the metric for measuring distance for the raw and standardized
                data. If a case has one or more values missing, the remaining values are
                used in the computations. That is, distances are computed between each
                pair of cases by using the values that are complete for both members of
                the pair. Select one of these metrics:
  GAMMA         1 - (Goodman-Kruskal gamma coefficient). Use with rank order or
                ordinal scales.
  PEARSON       1 - (Pearson product-moment correlation coefficient). Use with
                continuous data.
  RSQUARED      1 - (square of Pearson product-moment correlation). Use with
                continuous data.
  EUCLIDEAN     Euclidean distance (root mean squared distances). Use with continuous
                data. Default.
  MINKOWSKI     Minkowski distance, or pth root of the mean pth powered coordinate
                distances. Use POWER to set p. Use with continuous data.
  POWER=p       power p for the Minkowski distance metric. Use p=1 for city-block
                distance and p=2 for Euclidean.
  CHISQUARE     chi-square measure for each 2 x n frequency table. Use with frequency
                count data.
  PHISQUARE     phi-square (chi-square/total) of 2 x n frequency table. Use with
                frequency count data.
  ABSOLUTE      distances are computed using absolute differences. Use this metric for
                quantitative variables. The computation excludes the missing values.
  MAHALANOBIS   distances are computed using the square root of the quadratic form of the
                deviations among the random vectors using the inverse of their variance-
                covariance matrix. This metric also can be used to cluster groups. Use
                this metric with quantitative variables. Missing values are excluded from
                computations.
  COV           symmetric and positive definite matrix of order p by p (p is the number
                of elements).
  GROUP         specify the grouping variable for computing MAHALANOBIS distance.
  MW            minimum within sum of squares deviations (not available with JOIN).
                Use with continuous data.
  INITIAL       specify the initial seeds for clustering:
  NONE          starts with one cluster and splits it into two clusters by picking the case
                farthest from center as a seed for second cluster and then assigning each
                case optimally. It continues splitting and reassigning the cases until k
                clusters are formed.
  FIRSTK        considers first k non-missing cases as initial seeds.
  LASTK         considers last k non-missing cases as initial seeds.
  RANDOMK       chooses randomly (without replacement) k non-missing cases as initial
                seeds. If desired, specify the seed for random number generation using
                the RSEED command.
  RANDSEG       assigns each case to any of k partitions randomly. Computes seeds from
                each initial partition taking mean or median of the observations,
                whichever is applicable. If desired, specify the seed for random number
                generation using the RSEED command.
  PCA           uses the first principal component as a single variable. Sorts all cases
                based on this single variable. It creates partitions taking the first n/k
                cases in the first partition, next n/k cases in the second partition, and
                so on.
  INIFILE       specify the filename from which you want to give the initial seeds.
  PARTITION     specify the variable for which you have initial seeds.
  HIERSEG       makes initial k partitions from hierarchical clustering with specified
                linkage method.
  LINKAGE       specify the linkage method for hierarchical segmentation. The linkage
                methods are AVERAGE, CENTROID, COMPLETE, MEDIAN, SINGLE,
                WARD, and WEIGHTED.
  RSEED         specify the seed to be used for random number generation.


Examples:


KMEANS SCORES(1..10) 


KMEANS / NUMBER=4 
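
The MINKOWSKI and POWER descriptions above, together with the rule that only values complete for both members of a pair are used, can be written out as a short function. This Python sketch follows the wording in this section (a mean of powered coordinate differences rather than a sum); the function name `minkowski` and the use of `None` for a missing value are assumptions made for illustration.

```python
def minkowski(x, y, p=2.0):
    """p-th root of the mean p-th powered coordinate difference.

    None marks a missing value; only coordinates present in both cases
    contribute to the distance.
    """
    diffs = [abs(a - b) ** p
             for a, b in zip(x, y)
             if a is not None and b is not None]
    if not diffs:
        raise ValueError("no coordinates are complete for both cases")
    return (sum(diffs) / len(diffs)) ** (1.0 / p)


print(minkowski([0, 0], [3, 4], p=2))           # root mean squared difference
print(minkowski([0, 0], [3, 4], p=1))           # mean absolute (city-block) difference
print(minkowski([0, None, 0], [3, 9, 4], p=1))  # the incomplete coordinate is skipped
```

Because the powered differences are averaged, the p=2 result here is the root mean squared difference, which is smaller than the unnormalized Euclidean length of 5 for the same pair.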


K-Medians splitting method 


KMEDIANS command 


KMEDIANS varlist 


KMEDIANS splits cases into a selected number of groups by maximizing between- 
cluster variation and minimizing within-cluster variation. KMEDIANS uses rectangular 
SYSTAT files. If no variables are specified, every numeric variable is used. 


/ NUMBER=n      number of clusters. The default is 2, or one split of data.
  ITER=n        maximum number n of iterations. The default is 20.
  DISTANCE      specifies the metric for measuring distance for the raw and standardized
                data. If a case has one or more values missing, the remaining values are
                used in the computations. That is, distances are computed between each
                pair of cases by using the values that are complete for both members of
                the pair. Select one of these metrics:
  GAMMA         1 - (Goodman-Kruskal gamma coefficient). Use with rank order or
                ordinal scales.
  PEARSON       1 - (Pearson product-moment correlation coefficient). Use with
                continuous data.
  RSQUARED      1 - (square of Pearson product-moment correlation). Use with
                continuous data.
  EUCLIDEAN     Euclidean distance (root mean squared distances). Use with continuous
                data. Default.
  MINKOWSKI     Minkowski distance, or pth root of the mean pth powered coordinate
                distances. Use POWER to set p. Use with continuous data.
  POWER=p       power p for the Minkowski distance metric. Use p=1 for city-block
                distance and p=2 for Euclidean.
  CHISQUARE     chi-square measure for each 2 x n frequency table. Use with frequency
                count data.
  PHISQUARE     phi-square (chi-square/total) of 2 x n frequency table. Use with frequency
                count data.
  ABSOLUTE      distances are computed using absolute differences. Use this metric for
                quantitative variables. The computation excludes the missing values.
  MAHALANOBIS   distances are computed using the square root of the quadratic form of the
                deviations among the random vectors using the inverse of their variance-
                covariance matrix. This metric also can be used to cluster groups. Use
                this metric with quantitative variables. Missing values are excluded from
                computations.
  COV           symmetric and positive definite matrix of order p by p (p is the number
                of elements).
  GROUP         specify the grouping variable for computing MAHALANOBIS distance.
  MW            minimum within sum of squares deviations (not available with JOIN).
                Use with continuous data.
  INITIAL       specify the initial seeds for clustering:
  NONE          starts with one cluster and splits it into two clusters by picking the case
                farthest from center as a seed for second cluster and then assigning each
                case optimally. It continues splitting and reassigning the cases until k
                clusters are formed.
  FIRSTK        considers first k non-missing cases as initial seeds.
  LASTK         considers last k non-missing cases as initial seeds.
  RANDOMK       chooses randomly (without replacement) k non-missing cases as initial
                seeds. If desired, specify the seed for random number generation using
                the RSEED command.
  RANDSEG       assigns each case to any of k partitions randomly. Computes seeds from
                each initial partition taking mean or median of the observations,
                whichever is applicable. If desired, specify the seed for random number
                generation using the RSEED command.
  PCA           uses the first principal component as a single variable. Sorts all cases
                based on this single variable. It creates partitions taking the first n/k
                cases in the first partition, next n/k cases in the second partition, and
                so on.
  INIFILE       specify the filename from which you want to give the initial seeds.
  PARTITION     specify the variable for which you have initial seeds.
  HIERSEG       makes initial k partitions from hierarchical clustering with specified
                linkage method.
  LINKAGE       specify the linkage method for hierarchical segmentation. The linkage
                methods are AVERAGE, CENTROID, COMPLETE, MEDIAN,
                SINGLE, WARD, and WEIGHTED.
  RSEED         specify the seed to be used for random number generation.


Examples:


KMEDIANS SCORES(1..10) 


KMEDIANS / NUMBER=4 


Both Join and K-Clustering 


IDVAR command 


IDVAR var$ 


Specifies a character variable to label rows in output displays. 


SAVE command 


SAVE filename 


SAVE provides three options to save either cluster identifiers, cluster identifiers along
with data, or final cluster seeds to a file. (Saving final seeds is not available with the
JOIN command.) With JOIN, you must include NUMBER. Specify SAVE before JOIN, KMEANS, or
KMEDIANS.


/ NUMBER=n   number of clusters for which identifiers are saved. The default is 2, or one
             split of data.
  DATA       saves cluster indices as a new variable in the data file.
  SEEDS      saves the final seeds as a new variable in the data file (available only for
             K-Clustering).

Example: 


SAVE IDENT / NUMBER= 3 DATA SEEDS 


Additive Trees 


PLENGTH command 


PLENGTH SHORT 
MEDIUM 
LONG 


Controls the amount of output reported. To request extended results, enter PLENGTH 
prior to ADD. Select one of three categories of output: 


SHORT additive tree. 
MEDIUM SHORT output plus raw data and model distances. 
LONG MEDIUM output plus transformed data distances and residual distances. 
ADD command


ADD varlist 


Forms trees whose path (branch) lengths represent similarities among objects. 


/ DATA          raw data matrix.
  TRANSFORMED   data, after transformation into distance-like numbers.
  MODEL         model (tree) distances.
  RESIDUALS     residuals matrix.
  NONUMBERS     does not number objects in the tree graph.
  NOSUBTRACT    specifies an additive constant. The default assumption is interval-scale
                data; this implies complete freedom in choosing an additive constant.
                Therefore, the primary approach is to either add or subtract an additive
                constant to exactly satisfy the triangle inequality: d(i,j) + d(i,k) >= d(j,k)
                for all i,j,k, and d(i,j) + d(i,k) = d(j,k) for some i,j,k. But if
                NOSUBTRACT is specified, strict inequality is allowed; that is, if
                d(i,j) + d(i,k) > d(j,k) holds for all i,j,k in the data, no constant is
                subtracted.
  HEIGHT        requests printing of the distance of each node from the root.
  MINVAR        combines the last few remaining clusters into the root node. If MINVAR
                is specified, the program will search for the root that minimizes the
                variance of the distances from the root to the leaves.


Example:


ADD MUMBAI PARIS / DATA MODEL 
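
The additive-constant rule described under NOSUBTRACT can be sketched in a few lines of Python. This is an illustration of the stated triangle-inequality condition, not SYSTAT's code; the function name `additive_constant` and its `nosubtract` flag are hypothetical.

```python
from itertools import permutations


def additive_constant(d, nosubtract=False):
    """Constant c such that d[i][j] + c satisfies the triangle inequality
    for all distinct i, j, k, with equality for at least one triple.

    d is a symmetric matrix (list of lists) of dissimilarities for at
    least three objects.
    """
    n = len(d)
    # Smallest slack d(i,j) + d(i,k) - d(j,k) over all ordered triples.
    min_slack = min(d[i][j] + d[i][k] - d[j][k]
                    for i, j, k in permutations(range(n), 3))
    c = -min_slack
    if nosubtract and c < 0:
        # The strict inequality already holds everywhere, so nothing is subtracted.
        c = 0
    return c


violating = [[0, 1, 1], [1, 0, 5], [1, 5, 0]]   # 1 + 1 < 5 breaks the inequality
loose = [[0, 2, 2], [2, 0, 2], [2, 2, 0]]       # inequality holds strictly
print(additive_constant(violating))
print(additive_constant(loose))
print(additive_constant(loose, nosubtract=True))
```

Adding a constant c to every off-diagonal distance shifts each slack by exactly c, so the negative of the smallest slack is the value that makes the tightest triple hold with equality.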
CONJOINT:
Conjoint Analysis 


The CONJOINT module fits metric and nonmetric conjoint measurement models to 
observed data. It is designed to be a general additive model program using a simple 
optimization procedure. As such, CONJOINT can handle measurement models not 
normally amenable to other specialized conjoint programs. 


Setup: 
* CONJOINT 
* USE
* MODEL
SAVE 
HOT * ESTIMATE 
MODEL command 
MODEL depvarlist = indvarlist 
Specifies the additive model. 
Example: 
MODEL RESPONSE = DESIGN$..GUARANT$ 
SAVE command 


SAVE filename 


Saves specified statistics to a file. 
ESTIMATE command


ESTIMATE 


Determines estimation methods. 


/ ITERATIONS=n            The default is 50.
  CONVERGE=d              sets the convergence criterion. This is the largest relative
                          change in any coordinate before iterations terminate. The
                          default is 0.00001.
  LOSS = STRESS           maximizes Kendall's tau-b or minimizes Kruskal's
         TAU              STRESS directly. The default is STRESS.
  REGRESSION = LINEAR     the regression form may be linear (comparable to a linear
               MONOTONIC  regression model except that the model need not be full
               LOG        rank), monotonic (if LOSS=STRESS, this is Kruskal's
               POWER      MONANOVA model), log, or power (useful for Box-Cox
                          models). The default is MONOTONIC.
  POLARITY = POSITIVE     specifies the polarity of the preferences when doing
             NEGATIVE     preference mapping. If the smaller number indicates the least
                          and the higher number the most, POLARITY=POSITIVE.
                          For example, a questionnaire may include the question
                          "Rate a list of movies where one star is the worst and five
                          stars is the best." If the higher number indicates a lower
                          ranking and the lower number indicates a higher ranking,
                          POLARITY=NEGATIVE. For example, a questionnaire
                          may include the question "Rank your favorite sports team
                          where 1 is the best and 10 is the worst."


Example: 


ESTIMATE / REGRESSION=POWER POLARITY=NEGATIVE 


CORAN: 
Correspondence Analysis 


The CORAN module computes simple and multiple correspondence analysis for two-way
and multi-way tables of categorical variables.


Setup:
* CORAN
* USE
* MODEL
SAVE
HOT * ESTIMATE


MODEL command 


MODEL varlist 
MODEL depvarlist = indvarlist 


Specifies the variables to be fit. If there is only one varlist, it is a multiple 
correspondence model. If there is a dependent variable and an independent variable, it 
is a simple correspondence model. 


Examples: 


MODEL GROUP, AGE, SEX
MODEL ROW = COL 


SAVE command 


SAVE filename 


Saves coordinates for the row and column variables to a file. For simple
correspondence analysis, row coordinates appear in DIM(1)...DIM(N) and column
coordinates appear in FACTOR(1)...FACTOR(N), where the subscript denotes the
dimension number. Labeling information is saved to LABELS. For multiple
correspondence analysis, variable coordinates appear in DIM(1)...DIM(N) and case
coordinates appear in FACTOR(1)...FACTOR(N).


ESTIMATE command 


ESTIMATE 


Causes the model to be fit. 
CORR:
Correlations, Associations, and Distance Measures 


CORR computes correlations and measures of similarity and distance. It prints the 
resulting matrix and, if requested, saves it in a SYSTAT file for further analysis, such
as multidimensional scaling, cluster, or factor analysis. As a Quick Graph, CORR 
includes a SPLOM (a scatterplot matrix) where the data in each plot correspond to 
those used to compute a value in the matrix. 

For continuous data, CORR provides the Pearson correlation, covariances, and sum 
of squares of deviations from the mean and sum of cross-products of deviations 
(SSCP). In addition to the usual probabilities, the Bonferroni and Dunn-Sidak 
adjustments are available with Pearson correlations. If distances are desired, Euclidean 
or city-block distances are available. Similarity measures for continuous data include 
the Bray-Curtis coefficient and the QSK quantitative symmetric coefficient (or 
Kulczynski measure). 

For rank order data, CORR provides Goodman-Kruskal’s gamma, Guttman’s mu2, 
Spearman's rho, Kendall’s tau b and Stuart's tau c. 

For unordered data, CORR provides Phi coefficient, Cramer's V, Contingency 
coefficient, Goodman-Kruskal's lambda (symmetric measure) and Uncertainty 
coefficient (symmetric measure). 

For binary data, CORR provides S2, the positive matching dichotomy coefficient; 
S3, Jaccard's dichotomy coefficient; S4, the simple matching dichotomy coefficient; 
S5, Anderberg's dichotomy coefficient; S6, Tanimoto's dichotomy coefficient; S7, 
Anderberg's binary similarity coefficient; Yule's Q coefficient; Hamman's binary
similarity coefficient; Dice's binary similarity coefficient; Sneath and Sokal's binary 
similarity coefficient; Ochiai's binary similarity coefficient; Kulczynski's binary 
similarity coefficient; and Gower2 binary similarity coefficient. When underlying 
distributions are assumed to be normal, the tetrachoric correlation is available. 
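
Several of the binary measures named above are functions of the same 2 x 2 cross-classification of two 0/1 variables. The Python sketch below uses the textbook definitions of the Jaccard, simple matching, and Dice coefficients; the exact formulas SYSTAT uses for S3, S4, and DICE are not given in this section, so treat the correspondence as an assumption. All function names are hypothetical.

```python
def binary_counts(x, y):
    """Cross-classify two 0/1 vectors: a = both 1, b = x only, c = y only, d = both 0."""
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)
    d = sum(1 for u, v in zip(x, y) if u == 0 and v == 0)
    return a, b, c, d


def jaccard(x, y):
    """Joint presences divided by cases where at least one variable is 1."""
    a, b, c, _ = binary_counts(x, y)
    return a / (a + b + c)


def simple_matching(x, y):
    """Proportion of cases on which the two variables agree."""
    a, b, c, d = binary_counts(x, y)
    return (a + d) / (a + b + c + d)


def dice(x, y):
    """Like Jaccard, but joint presences are counted twice."""
    a, b, c, _ = binary_counts(x, y)
    return 2 * a / (2 * a + b + c)


x = [1, 1, 0, 0, 1]
y = [1, 0, 0, 1, 1]
print(binary_counts(x, y), jaccard(x, y), simple_matching(x, y), dice(x, y))
```

Jaccard and Dice ignore joint absences (d), whereas simple matching counts them as agreement; both `jaccard` and `dice` are undefined when neither variable is ever 1.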

When data are missing, listwise and pairwise deletion methods are available for all 
measures. An EM algorithm is an option for maximum likelihood estimates of 
correlation, covariance, and cross-products of deviations matrices. For robust ML 
estimates where outliers are down weighted, the user can specify the degrees of 
freedom for the t distribution or contamination for a normal distribution. CORR 
includes a graphical display of the pattern of missing values. Little's MCAR test is 
printed with the display. EM also identifies cases with extreme Mahalanobis distances. 

Hadi's robust outlier detection and estimation procedure is an option for 
correlations, covariances, and SSCP—cases identified as outliers by the procedure are 
not used to compute estimates. 
Setup:
        CORR
        USE
        PLENGTH
        SAVE
        SAMPLE
HOT     PEARSON or SPEARMAN or BC or QSK or TETRA or GAMMA or
        MU2 or TAUB or TAUC or COVARIANCE or SSCP or EUCLIDEAN
        or CITY or CONT or LAMBDA or PHI or UNCE or CRAMER or
        YULEQ or S2 or S3 or S4 or S5 or S6 or HAMMAN or DICE or S7
        or SNEATH or OCHIAI or KULCZY or GOWER

Continue by changing options or by selecting a default measure.

Note: To request extended results, enter the command PLENGTH LONG prior to a HOT
command.

PLENGTH command 


PLENGTH SHORT 
MEDIUM 
LONG 


Controls the amount of output reported. Select one of two categories of output:


SHORT          correlation matrix
MEDIUM/LONG    short output plus the mean of each variable. In addition, for EM
               estimation, the iteration history, missing value patterns, Little's
               MCAR test, and mean estimates are also reported.




SAVE command 

SAVE filename 

/ TYPE= RECTANGULAR
        SSCP
        COVARIANCE
        CORRELATION
        DISSIMILARITY
        SIMILARITY


defines the structure of the saved file. RECTANGULAR corresponds to standard
cases-by-variables files. The other types denote matrix files.


Saves the next matrix you request. If you request a rectangular section of a matrix
(rowlist * collist), SYSTAT saves the union of the row and column variables (a triangle),
not just the rectangle of measures that are printed.


SAMPLE command 


SAMPLE BOOT (m,n) 
SIMPLE (m,n) 
JACK 


The argument m is the number of samples; the argument n is the size of each sample. 


To obtain the summarized resampling output, issue the SAMPLE command before the
HOT command that indicates the type of correlation. SYSTAT gives a summary
based on resampling for Pearson, Spearman, gamma, tau b, and mu2 correlation
coefficients.


/ CONFI=c    specifies a confidence level for the bootstrap-based confidence interval.
             The default value is 0.95.


Examples: 


SAMPLE JACK 


PEARSON SALBEG SALNOW 
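
The jackknife request in the example can be pictured with a short sketch. This is not SYSTAT's implementation: it assumes the usual leave-one-out jackknife, and the function names and salary figures are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def jackknife_pearson(x, y):
    """Leave-one-out replicates, their mean, and the jackknife standard error."""
    n = len(x)
    reps = [pearson(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(n)]
    mean = sum(reps) / n
    se = math.sqrt((n - 1) / n * sum((r - mean) ** 2 for r in reps))
    return reps, mean, se

# Hypothetical data standing in for SALBEG and SALNOW.
salbeg = [8.4, 9.0, 10.2, 11.1, 12.5, 13.0]
salnow = [12.0, 13.5, 14.1, 17.0, 18.2, 21.0]
reps, mean, se = jackknife_pearson(salbeg, salnow)
print(len(reps), round(mean, 3), round(se, 3))
```

Each replicate drops one case, so six cases give six correlations; the summary reported by a resampling run is built from replicates of this kind.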
Correlations, Covariances, and SSCP 


PEARSON command 


PEARSON varlist 


or 
PEARSON rowlist *collist 


Computes Pearson correlations for all variables in varlist or just the pairs of variables
defined by the row and column lists. If you omit the argument, all numeric variables
are used.


/ BONF            Bonferroni adjustment to the probability that allows for multiple
                  testing; that is, use these probabilities when scanning a matrix for
                  significant correlations.
  DUNN            Dunn-Sidak adjustment to the probability, similar to that of the
                  Bonferroni.
  PROB            the usual probability associated with a single correlation coefficient.
                  Appropriate only if you select, a priori, a specific correlation to report.
  EM              maximum likelihood expectation maximization method.
  T=df            when data are missing, produces maximum likelihood estimates for a
                  t distribution sample where df is the degrees of freedom. The default is 5.
  NORMAL=n1,n2    maximum likelihood estimates for contaminated multivariate normal
                  samples where n1 is the probability of contamination and n2 is the
                  variance of the contamination.
  ITER=n          maximum number of iterations for computing the estimates. The default
                  is 20.
  CONV=n          convergence criterion. If the relative changes of the covariance entries
                  are all less than this value, convergence is assumed. The default is 0.001.
  HADI            Hadi's robust estimation procedure, which also identifies outliers.
  TOL=n           variables are not used whose R² (with the other variables) is greater than
                  (1.0-n). The default is 0.0001.
  LISTWISE        if any selected variable has a value missing, the case is omitted from the
                  computation.
  PAIRWISE        for each entry in the matrix, cases are used that have both values present.


Example:


PEARSON AGE INCOME EDUC / BONF PAIRWISE
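
The BONF and DUNN adjustments can be illustrated with the textbook formulas. The sketch below is an assumption about the arithmetic, not SYSTAT's code; in particular, taking the number of tests k to be the number of distinct off-diagonal correlations is a guess, and the function names are hypothetical.

```python
def bonferroni(p, k):
    """Bonferroni-adjusted probability for one of k tests (capped at 1)."""
    return min(1.0, k * p)

def dunn_sidak(p, k):
    """Dunn-Sidak-adjusted probability for one of k tests."""
    return 1.0 - (1.0 - p) ** k

# Three variables (AGE, INCOME, EDUC) give 3 distinct correlations.
n_vars = 3
k = n_vars * (n_vars - 1) // 2

p = 0.01  # unadjusted probability of a single correlation (the PROB value)
print(k, bonferroni(p, k), round(dunn_sidak(p, k), 6))
```

The Dunn-Sidak value is never larger than the Bonferroni value, which is why the two adjustments are described as similar.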
COVARIANCE command 


COVARIANCE varlist 


or 
COVARIANCE rowlist *collist 


Computes a covariance matrix for all variables in varlist or just the pairs of variables
defined by the row and column lists. If you omit the argument, all numeric variables
are used.


/ EM              maximum likelihood expectation maximization method.
  T=df            when data are missing, produces maximum likelihood estimates for a
                  t distribution sample where df is the degrees of freedom. The default
                  is 5.
  NORMAL=n1,n2    maximum likelihood estimates for contaminated multivariate normal
                  samples where n1 is the probability of contamination and n2 is the
                  variance of the contamination.
  ITER=n          maximum number of iterations for computing the estimates. The
                  default is 20.
  CONV=n          convergence criterion. If the relative changes of the covariance entries
                  are all less than this value, convergence is assumed. The default is
                  0.001.
  HADI            Hadi's robust estimation procedure, which also identifies outliers.
  TOL=n           variables are not used whose R² (with the other variables) is greater
                  than (1.0-n). The default is 0.0001.
  LISTWISE        if any selected variable has a value missing, the case is omitted from
                  the computation.
  PAIRWISE        for each entry in the matrix, cases are used that have both values
                  present.


Example:


COVARIANCE AGE INCOME EDUC
SSCP command 


SSCP varlist 

or 

SSCP rowlist *collist 

Computes sums of squares of deviations from the mean and cross-products of
deviations for all variables in varlist or just the pairs of variables defined by the row and
column lists. If you omit the argument, all numeric variables are used.


/ EM              maximum likelihood expectation maximization method.
  T=df            when data are missing, produces maximum likelihood estimates for a t
                  distribution sample where df is the degrees of freedom. The default is
                  5.
  NORMAL=n1,n2    maximum likelihood estimates for contaminated multivariate normal
                  samples where n1 is the probability of contamination and n2 is the
                  variance of the contamination.
  ITER=n          maximum number of iterations for computing the estimates. The
                  default is 20.
  CONV=n          convergence criterion. If the relative changes of the covariance entries
                  are all less than this value, convergence is assumed. The default is
                  0.001.
  HADI            Hadi's robust estimation procedure, which also identifies outliers.
  TOL=n           variables are not used whose R² (with the other variables) is greater
                  than (1.0-n). The default is 0.0001.
  LISTWISE        if any selected variable has a value missing, the case is omitted from
                  the computation.
  PAIRWISE        for each entry in the matrix, cases are used that have both values
                  present.

Example: 


SSCP AGE INCOME EDUC / EM 
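
As a rough illustration of what an SSCP matrix contains and how it relates to a covariance matrix, the following sketch computes both for complete data. It does not reproduce the EM option (no missing values are handled), and the function names and data are hypothetical.

```python
def sscp(columns):
    """Sums of squares and cross-products of deviations from the mean.

    `columns` is a list of equal-length variables (one list per variable).
    """
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    dev = [[v - m for v in c] for c, m in zip(columns, means)]
    return [[sum(a * b for a, b in zip(di, dj)) for dj in dev] for di in dev]

def covariance(columns):
    """Sample covariance matrix: the SSCP matrix divided by n - 1."""
    n = len(columns[0])
    return [[v / (n - 1) for v in row] for row in sscp(columns)]

# Hypothetical complete data for two variables.
age = [1.0, 2.0, 3.0, 4.0]
income = [2.0, 4.0, 6.0, 8.0]
print(sscp([age, income]))
print(covariance([age, income]))
```

The diagonal holds each variable's sum of squared deviations; the off-diagonal entries hold the cross-products.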
Distance and similarity measures for continuous data 


BC command 


BC varlist 
or 
BC rowlist *collist 


Computes Bray-Curtis similarity measures for the continuous data in varlist or just the 
pairs of variables defined by the row and column lists. If you omit the argument, all 
numeric variables are used. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


BC AGE INCOME EDUC 


QSK command 


QSK varlist 

or 

QSK rowlist *collist 

Computes quantitative symmetric similarity coefficients in varlist or just the pairs of 
variables defined by the row and column lists. If you omit the argument, all numeric 
variables are used. 

/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


QSK AGE INCOME EDUC 
EUCLIDEAN command 


EUCLIDEAN varlist 
or 
EUCLIDEAN rowlist *collist 


Computes a normalized Euclidean distance (root-mean-squared differences) matrix for 
all the variables in varlist or just the pairs of variables defined by the row and column 
lists. If you omit the argument, all numeric variables are used. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


EUCLIDEAN AGE INCOME EDUC 
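
The phrase "root-mean-squared differences" can be made concrete with a small sketch. It assumes the distance between two variables is the square root of the mean squared difference across cases; the function name and data are hypothetical, and this is not SYSTAT's code.

```python
import math

def rms_distance(x, y):
    """Normalized Euclidean distance: root of the mean squared difference."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical values of two variables over the same three cases.
age = [1.0, 2.0, 3.0]
educ = [2.0, 4.0, 6.0]
print(round(rms_distance(age, educ), 4))
```

Dividing by the number of cases is what makes the distance "normalized": it does not grow simply because more cases are present.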


CITY command 


CITY varlist 


or 
CITY rowlist *collist 


Computes city-block distances (sum of absolute discrepancies) for all variables in
varlist or just the pairs of variables defined by the row and column lists. If you omit the
argument, all numeric variables are used.


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


CITY AGE INCOME EDUC / PAIRWISE 
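
The difference between LISTWISE and PAIRWISE deletion is easiest to see on a small case. The sketch below uses the plain sum of absolute differences named in the description and marks missing values with None; whether SYSTAT rescales the sum is not stated here, so treat the arithmetic as an assumption. All names and data are hypothetical.

```python
def city_block(x, y):
    """Sum of absolute differences over cases where both values are present."""
    return sum(abs(a - b) for a, b in zip(x, y)
               if a is not None and b is not None)

def listwise(columns):
    """Drop every case that has a missing value in any selected variable."""
    rows = [r for r in zip(*columns) if None not in r]
    return [list(c) for c in zip(*rows)]

age = [1, 2, 3, 4]
income = [2, None, 5, 7]
educ = [1, 1, None, 2]

# Pairwise: AGE and INCOME share cases 1, 3, and 4.
pairwise_d = city_block(age, income)

# Listwise: only cases 1 and 4 are complete on all three variables.
age_l, income_l, educ_l = listwise([age, income, educ])
listwise_d = city_block(age_l, income_l)

print(pairwise_d, listwise_d)
```

With pairwise deletion each entry of the matrix can rest on a different set of cases; with listwise deletion every entry uses the same complete cases.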
Measures for rank-order data 


SPEARMAN command 


SPEARMAN varlist 
or 
SPEARMAN rowlist *collist 


Computes Spearman rank-order correlations for all variables in varlist or just the pairs 
of variables defined by the row and column lists. If you omit the argument, all numeric 
variables are used. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


SPEARMAN SCORE(1 .. 15) 
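
A Spearman rank-order correlation is commonly computed as a Pearson correlation on ranks. The sketch below follows that textbook route; the use of average ranks for ties is an assumption (tie handling is not described here), and the function names are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Pearson correlation of the ranked values."""
    return pearson(average_ranks(x), average_ranks(y))

print(average_ranks([10, 20, 20, 30]))
print(spearman([1, 2, 3, 4], [1, 4, 9, 16]))
```

Because only ranks enter the calculation, any monotone increasing relationship yields a correlation of 1.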


GAMMA command 


GAMMA varlist 
or 
GAMMA rowlist *collist 


Computes Goodman-Kruskal's gamma coefficients for ordered categories for all 
variables in varlist or just the pairs of variables defined by the row and column lists. If 
you omit the argument, all numeric variables are used. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


GAMMA SCORE(1 .. 15) 
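
Goodman-Kruskal's gamma is conventionally defined from counts of concordant and discordant pairs of cases. The following sketch uses that standard definition, (C - D) / (C + D), with tied pairs ignored; it is an illustration with hypothetical names and data, not SYSTAT's algorithm.

```python
def concordance(x, y):
    """Count concordant and discordant pairs; pairs tied on x or y are skipped."""
    c = d = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                c += 1
            elif s < 0:
                d += 1
    return c, d

def gamma(x, y):
    """Goodman-Kruskal gamma for two ordered variables."""
    c, d = concordance(x, y)
    return (c - d) / (c + d)

# Hypothetical ordered categories for four cases.
score1 = [1, 2, 3, 3]
score2 = [1, 3, 2, 3]
print(concordance(score1, score2), gamma(score1, score2))
```

The function divides by C + D, so it is undefined when every pair is tied; a production version would need to guard that case.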
MU2 command 


MU2 varlist 


or 
MU2 rowlist *collist 


Computes Guttman mu2 monotonicity coefficients for all variables in varlist or just the 
pairs of variables defined by the row and column lists. If you omit the argument, all 
numeric variables are used. 


/ LISTWISE if any selected variable has a value missing, the case is omitted from the com- 
putation. 
PAIRWISE for each entry in the matrix, cases are used that have both values present. 


Example: 


MU2 SCORE(1 .. 15) 


TAUB command 


TAUB varlist 


or 
TAUB rowlist *collist 


Computes Kendall tau b coefficients for all variables in varlist or just the pairs of 
variables defined by the row and column lists. If you omit the argument, all numeric 
variables are used. 

/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


TAUB SCORE(1 .. 15) 
TAUC command 


TAUC varlist 


or 
TAUC rowlist *collist 


Computes Stuart's tau c coefficients for ordered categories for all variables in varlist or 
just for the pairs of variables defined by the row and column lists. 


/ LISTWISE if any selected variable has a value missing, the case is omitted from the 
computation. 


PAIRWISE for each entry in the matrix, cases are used that have both values present. 


Example: 


TAUC SCORE(1 .. 15) 


Measures for unordered data 


PHI command 


PHI varlist 


or 
PHI rowlist *collist 


Computes phi coefficients for all variables in varlist or just the pairs of variables 
defined by the row and column lists. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


PHI CATEGORY1 CATEGORY2 CATEGORY3 
CONT command 


CONT varlist 


or 
CONT rowlist *collist 


Computes contingency coefficients for all variables in varlist or just for the pairs of 
variables defined by the row and column lists. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


CONT CATEGORY1 CATEGORY2 CATEGORY3 


LAMBDA command 


LAMBDA varlist 
or 
LAMBDA rowlist *collist 


Computes symmetric lambda coefficients for all variables in varlist or just for the pairs 
of variables defined by the row and column lists. If you omit the argument, all 
categorical variables are used. 

/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


LAMBDA CATEGORY1 CATEGORY2 CATEGORY3 
UNCE command 


UNCE varlist 


or 
UNCE rowlist *collist 


Computes symmetric uncertainty coefficients for all variables in varlist or just for the 
pairs of variables defined by the row and column lists. 


/ LISTWISE if any selected variable has a value missing, the case is omitted from the com- 
putation. 
PAIRWISE for each entry in the matrix, cases are used that have both values present. 


Example: 


UNCE CATEGORY1 CATEGORY2 CATEGORY3 


CRAMER command 


CRAMER varlist 
or 
CRAMER rowlist *collist 


Computes Cramer's V coefficients for all variables in varlist or just for the pairs of 
variables defined by the row and column lists. 


/ LISTWISE if any selected variable has a value missing, the case is omitted from the com- 
putation. 
PAIRWISE for each entry in the matrix, cases are used that have both values present. 


Example: 


CRAMER CATEGORY1 CATEGORY2 CATEGORY3 
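
The phi coefficient, the contingency coefficient, and Cramer's V are all conventionally derived from the Pearson chi-square statistic of a two-way frequency table. The sketch below uses those textbook formulas; it is not taken from SYSTAT, and the function names and table are hypothetical.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and total count for a two-way table."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def association(table):
    """Phi, contingency coefficient, and Cramer's V from chi-square."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1
    return {
        "phi": math.sqrt(chi2 / n),
        "cont": math.sqrt(chi2 / (chi2 + n)),
        "cramer_v": math.sqrt(chi2 / (n * k)),
    }

# Hypothetical cross-tabulation of two categorical variables.
table = [[10, 0], [0, 10]]
print(association(table))
```

For a 2-by-2 table phi and Cramer's V coincide; the contingency coefficient cannot reach 1 even under perfect association.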
Measures for binary data 


TETRA command 


TETRA varlist 
or 
TETRA rowlist *collist 


Computes tetrachoric correlations in varlist or just the pairs of variables defined by the 
row and column lists. If you omit the argument, all numeric variables are used. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


TETRA SCORE(1 .. 15) 


YULEQ command 


YULEQ varlist 
or 
YULEQ rowlist *collist 


Computes Yule’s Q coefficients for ordered categories for all variables in varlist or just 
for the pairs of variables defined by the row and column lists. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


YULEQ ITEM1 ITEM2 ITEM3 
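
Yule's Q is usually written in terms of the four cells of a 2-by-2 table as (ad - bc) / (ad + bc). The sketch below builds the table from two 0/1 variables and applies that formula; the names and data are hypothetical, and the formula is the standard one rather than something stated in this manual.

```python
def two_by_two(x, y):
    """Cell counts for two 0/1 variables: a=(1,1), b=(1,0), c=(0,1), d=(0,0)."""
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)
    d = sum(1 for u, v in zip(x, y) if u == 0 and v == 0)
    return a, b, c, d

def yule_q(x, y):
    """Yule's Q for two binary variables."""
    a, b, c, d = two_by_two(x, y)
    return (a * d - b * c) / (a * d + b * c)

# Hypothetical binary items for six cases.
item1 = [1, 1, 1, 0, 0, 0]
item2 = [1, 1, 0, 1, 0, 0]
print(two_by_two(item1, item2), yule_q(item1, item2))
```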
HAMMAN command 


HAMMAN varlist 
or 
HAMMAN rowlist *collist 


Computes Hamman’s coefficients for ordered categories for all variables in varlist or 
just for the pairs of variables defined by the row and column lists. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


HAMMAN ITEM1 ITEM2 ITEM3 


DICE command 


DICE varlist 


or 
DICE rowlist *collist 


Computes Dice's coefficients for ordered categories for all variables in varlist or just 
for the pairs of variables defined by the row and column lists. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


DICE ITEM1 ITEM2 ITEM3 




S7 command 


S7 varlist 


or 
S7 rowlist *collist 


Computes Anderberg's (S7) coefficients for ordered categories for all variables in 
varlist or just for the pairs of variables defined by the row and column lists. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


S7 ITEM1 ITEM2 ITEM3 


SNEATH command 


SNEATH varlist 
or 
SNEATH rowlist *collist 


Computes Sneath's coefficients for ordered categories for all variables in varlist or just 
for the pairs of variables defined by the row and column lists. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


SNEATH ITEM1 ITEM2 ITEM3 
OCHIAI command 


OCHIAI varlist 
or 
OCHIAI rowlist *collist 


Computes Ochiai's coefficients for ordered categories for all variables in varlist or just 
for the pairs of variables defined by the row and column lists. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


OCHIAI ITEM1 ITEM2 ITEM3 


KULCZY command 


KULCZY varlist 
or 
KULCZY rowlist *collist 


Computes Kulczynski's coefficients for ordered categories for all variables in varlist or just 
for the pairs of variables defined by the row and column lists. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


KULCZY ITEM1 ITEM2 ITEM3 
GOWER command 


GOWER varlist 
or 
GOWER rowlist *collist 


Computes Gower's coefficients for ordered categories for all variables in varlist or just 
for the pairs of variables defined by the row and column lists. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


GOWER ITEM1 ITEM2 ITEM3 


Dichotomy coefficients commands 


S2 or S3 or S4 or S5 or S6 varlist 
or 
S2 or S3 or S4 or S5 or S6 rowlist *collist 


Computes dichotomous coefficients for all variables in varlist or just the pairs of 
variables defined by the row and column lists. If you omit the argument, all numeric 
variables are used. S2 is the positive matching dichotomy coefficient; S3, Jaccard's 
dichotomy coefficient; S4, the simple matching dichotomy coefficient; S5, 
Anderberg's dichotomy coefficient; and S6, Tanimoto's dichotomy coefficient. 


/ LISTWISE    if any selected variable has a value missing, the case is omitted from the
              computation.
  PAIRWISE    for each entry in the matrix, cases are used that have both values present.


Example: 


S4 ITEM1 ITEM2 ITEM3 
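
Two of the dichotomy coefficients have widely used textbook forms: Jaccard (S3) as a / (a + b + c) and simple matching (S4) as (a + d) / (a + b + c + d), where a counts joint presences and d joint absences. The sketch below applies those forms; the exact definitions SYSTAT uses are not given here, so this is an assumption, and all names and data are hypothetical.

```python
def cell_counts(x, y):
    """a=(1,1), b=(1,0), c=(0,1), d=(0,0) for two 0/1 variables."""
    pairs = list(zip(x, y))
    a = pairs.count((1, 1))
    b = pairs.count((1, 0))
    c = pairs.count((0, 1))
    d = pairs.count((0, 0))
    return a, b, c, d

def jaccard_s3(x, y):
    """Joint presences over cases where at least one item is present."""
    a, b, c, _ = cell_counts(x, y)
    return a / (a + b + c)

def simple_matching_s4(x, y):
    """Proportion of cases on which the two items agree."""
    a, b, c, d = cell_counts(x, y)
    return (a + d) / (a + b + c + d)

# Hypothetical binary items for six cases.
item1 = [1, 1, 1, 0, 0, 0]
item2 = [1, 1, 0, 1, 0, 0]
print(jaccard_s3(item1, item2), simple_matching_s4(item1, item2))
```

The two coefficients differ only in whether joint absences count as agreement, which matters when zeros are far more common than ones.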
DESIGN: 
Design of Experiments 


DESIGN generates design matrices for complete factorial and fractional factorial 
designs, including two-level designs recommended by Box and Hunter (up to 11 
factors), Latin square designs (up to 12 levels), 13 Taguchi designs, Plackett-Burman 
designs (specific designs range from 2 to 7 levels for up to 40 factors), and Box and 
Behnken with optional blocking factors. Four types of mixture models described by 
Cornell (1990) are available: Lattice, Centroid, Axial, and Screening designs. You can 
save the design in a file and later add values of the dependent variable to the file for 
analysis of the experimental results. 


Setup: 
* DESIGN 
PLENGTH 
SAVE 
HOT * FACTORIAL or BOXHUNTER or LATIN or TAGUCHI 
or PLACKETT or BOXBEHNKEN or MIXTURE 
PLENGTH command 


PLENGTH NONE 
SHORT 
LONG 


For Box-Hunter designs, using PLENGTH LONG in Classic DOE yields a listing of the
generators (confounded effects) for the design. PLENGTH LONG prior to TAGUCHI
gives a table of interaction equivalents.
SAVE command 


SAVE filename 


Saves the design in a file. Specify prior to FACTORIAL, BOXHUNTER, LATIN, 
TAGUCHI, PLACKETT, BOXBEHNKEN, or MIXTURE. If you do not want to view the 
design or print it, specify PLENGTH NONE before the design request. 


FACTORIAL command 


FACTORIAL 


Creates a complete factorial design with either two or three levels per factor. 


/ LEVELS = 2 or 3    number of levels per factor. The default value is 2.
  FACTORS = n        number of factors. For two-level designs, specify 1 < n < 8. For
                     three-level designs, specify 1 < n < 6. The default value is 2.
  REPS = n           number of replications of the design. The default value is 1.
  LETTERS            labels the factors with letters instead of numbers (which is the default).
  RAND               randomizes the runs (cases).


Example:


FACTORIAL / LEVELS=3 FACTORS=5
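
A complete factorial design lists every combination of factor levels, so its size is the number of levels raised to the number of factors. The sketch below enumerates such a design; coding the levels as 1, 2, 3 and the function name are assumptions made for illustration, not SYSTAT's output format.

```python
from itertools import product

def full_factorial(factors, levels=2, reps=1):
    """All level combinations for `factors` factors, repeated `reps` times."""
    runs = list(product(range(1, levels + 1), repeat=factors))
    return runs * reps

design = full_factorial(factors=5, levels=3)
print(len(design))   # number of runs for five three-level factors
print(design[:3])    # first few runs
```

Three levels and five factors give 3**5 = 243 runs, which shows how quickly complete designs grow and why fractional designs are offered as well.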


BOXHUNTER command 


BOXHUNTER 


Creates two-level fractional factorial designs recommended by Box and Hunter. 


/ FACTORS = n    number of factors. Specify 1 < n < 12. The default value is 2.
  RUNS = n       number of runs. Specify 3 < n < 129 and be sure n is an integer power of 2
                 and that it exceeds the number of factors by 1. Note that some
                 combinations of RUNS and FACTORS can result in complete (and
                 perhaps replicated) factorial designs as opposed to the incomplete designs.
  REPS = n       number of replications of the design. The default value is 1.
  LETTERS        labels the factors with letters instead of numbers (which is the default).
  RAND           randomizes the runs (cases).


Example:


BOXHUNTER / FACTORS=4 RUNS=5


LATIN command 


LATIN 


Creates Latin square designs. 


/ LEVELS = n    number of levels for each of the three factors. Specify 2 < n < 13.
  SQUARE        displays the Latin square. The default display shows runs-by-factors. If
                you save the design, it is saved in runs-by-factors format.
  REPS = n      number of replications of the design. The default value is 1.
  LETTERS       labels the factors with letters instead of numbers (the default) when you
                display the design in runs-by-factors format or when you save it.
  RAND          randomly permutes the rows and columns of the Latin square. In addition,
                for four-level designs, a random selection of one of four possible standard
                squares is made prior to permutations. (For three-level designs, only one
                standard square exists.)


Example:


LATIN / SQUARE LETTERS
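
The two display formats mentioned for LATIN (the square itself and the runs-by-factors listing) can be sketched as follows. The cyclic construction used here is one standard way to build a Latin square; which standard square SYSTAT starts from is not stated, so that choice and the function names are assumptions.

```python
def latin_square(n):
    """Cyclic n-by-n Latin square with treatments coded 1..n."""
    return [[(i + j) % n + 1 for j in range(n)] for i in range(n)]

def runs_by_factors(square):
    """One run per cell: (row level, column level, treatment level)."""
    n = len(square)
    return [(i + 1, j + 1, square[i][j]) for i in range(n) for j in range(n)]

square = latin_square(4)
for row in square:
    print(row)
print(runs_by_factors(square)[:4])
```

Every treatment appears exactly once in each row and each column, which is the defining property that the RAND option preserves when it permutes rows and columns.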
TAGUCHI command


TAGUCHI


Creates a Taguchi design defined by TYPE.


/ TYPE = L4      4 runs, 3 factors, 2 levels each
         L8      8 runs, 7 factors, 2 levels each
         L9      9 runs, 4 factors, 3 levels each
         L12     12 runs, 11 factors, 2 levels each
         L16     16 runs, 15 factors, 2 levels each
         LP16    16 runs, 5 factors, 4 levels each
         L18     18 runs, 1 two-level factor, 7 three-level factors
         L25     25 runs, 6 factors, 5 levels each
         L27     27 runs, 13 factors, 3 levels each
         L32     32 runs, 31 factors, 2 levels each
         LP32    32 runs, 1 two-level factor, 9 four-level factors
         L36     36 runs, 11 two-level factors, 12 three-level factors
         L54     54 runs, 1 two-level factor, 25 three-level factors
  REPS = n       number of replications of the design. The default value is 1.
  LETTERS        labels the factors with letters instead of numbers (which is the default). If you
                 specify PLENGTH LONG prior to TAGUCHI, you get a table of interaction
                 equivalents.
  RAND           randomizes the runs (cases).


Example:


TAGUCHI / TYPE=L12 LETTERS
PLACKETT command


PLACKETT
Creates Plackett-Burman designs. To define the specific design, use the RUNS option.


/ RUNS = n       when each factor has two levels, specify 4 <= 4n <= 100 for one fewer
                 factors than runs.
           9     4 factors each with 3 levels
           27    13 factors each with 3 levels
           81    40 factors each with 3 levels
           25    6 factors each with 5 levels
           125   31 factors each with 5 levels
           49    8 factors each with 7 levels
  REPS = n       number of replications of the design. The default value is 1.
  LETTERS        labels the factors with letters instead of numbers (which is the default).
  RAND           randomizes the runs (cases).

Example: 


PLACKETT / RUNS=81 LETTERS 


BOXBEHNKEN command 


BOXBEHNKEN 
Creates Box and Behnken designs. 


/ FACTORS = 3     no blocking possible
            4     3 blocks of 9 cases
            5     2 blocks of 23 cases
            6     2 blocks of 27 cases
            7     2 blocks of 31 cases
            9     5 blocks of 26 cases
            10    2 blocks of 85 cases
            11    no blocking possible
            12    2 blocks of 102 cases
            16    6 blocks of 66 cases
  BLOCK           produces the number of blocks shown under FACTORS.
  LETTERS         labels the factors with letters instead of numbers (which is the default).
  RAND            randomizes the runs (cases).
  REPS = n        number of replications of the design. The default value is 1.


Example:


BOXBEHNKEN / FACTORS=4 BLOCK


MIXTURE command 


MIXTURE 


Creates any of the four types of mixture models. 


/ TYPE = LATTICE     four types of mixture models.
         CENTROID
         AXIAL
         SCREEN
  FACTORS = n        number of possible mixture components. The default is 3.
  LEVELS = n         number of values that each component (factor) assumes,
                     including 0 and 1. Applies only to LATTICE models.
                     The default is 4. For the other mixture models, the number of
                     factors determines the number of levels.
  LETTERS            labels the factors with letters instead of numbers (which is the
                     default).
  RAND               randomizes the runs (cases).
  REPS = n           number of replications of the design. The default value is 1.


Example:


MIXTURE / LEVELS=3 FACTORS=5
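
For a lattice mixture design, the meaning of LEVELS (the number of equally spaced values each component takes, including 0 and 1) can be sketched by listing every blend whose proportions sum to 1. The code follows the usual simplex-lattice construction; it assumes the lattice type applies, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def simplex_lattice(factors, levels):
    """Blends of `factors` components on `levels` equally spaced values in [0, 1]."""
    m = levels - 1  # number of steps between 0 and 1
    points = []
    for combo in product(range(m + 1), repeat=factors):
        if sum(combo) == m:  # proportions must sum to 1
            points.append(tuple(Fraction(c, m) for c in combo))
    return points

points = simplex_lattice(factors=5, levels=3)
print(len(points))
print([str(v) for v in points[0]])
```

With five components and three levels each proportion is 0, 1/2, or 1, so the design consists of the five pure components plus the ten 50:50 binary blends.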
DISCRIM: 
Classical Discriminant Analysis 


DISCRIM provides linear or quadratic functions of the variables that “best” separate 
cases into two or more predefined groups. The variables in the linear function can be 
selected in a forward or backward stepwise manner, either interactively by the user or 
automatically by SYSTAT. For the latter at each step, SYSTAT enters the variable that 
contributes most to the separation of the groups (or removes the variable that is least 
useful). Contrasts that emphasize the difference between specific groups can be used 
to guide variable selection. Cases can be classified even if they are not used in the 
computations. 

Print options allow the user to select panels of output to display including group 
means, variances, covariances, and correlations. Available at each step are 
discriminant functions, F-ratios for entering or removing variables, Wilks’ lambda or 
U statistics (with an approximate F-ratio), F-ratios for pairwise differences between 
group means, discriminant functions, classification and jackknife classification 
matrices with percent correct classification. For each case, the user can request 
posterior probabilities for its assignment to each group, Mahalanobis’ distances to the 
centroid of each group, and canonical variable scores. Coefficients for canonical 
variables in the original units or standardized data are available. Users can also request 
eigenvalues, canonical correlations, and the Lawley-Hotelling and Pillai traces with 
their associated approximate F-ratios. Users can save posterior probabilities, 
Mahalanobis’ distances, and canonical variable scores. 
Setup:
        DISCRIM
        USE
        MODEL
        CONTRAST
        PLENGTH
        SAVE
HOT     START or ESTIMATE
HOT     STEP
        PLENGTH
        STEP
        (series of STEP commands)
HOT     STOP


MODEL command


MODEL grpvar = varlist


Specifies the model to estimate. 


/ PRIORS=EQUAL            assigns equal prior probabilities to the groups.
         SAMPLE           computes and assigns prior probabilities proportional to the sample
                          size of that group.
         n1,n2, ...nk     prior probabilities of group membership listed in the order of their
                          respective categories.
  QUADRATIC               quadratic discriminant analysis. If omitted, linear discriminant
                          analysis is performed.


Example:


MODEL SPECIES$ = PETALWID PETALLEN SEPALWID / PRIORS=.25, .50, .25
CONTRAST command 


CONTRAST [matrix] 


Specifies h sets of contrast coefficients to guide the discrimination. The default is 
h = ng-1, where ng denotes the number of groups. For example, the contrast matrix for 
ng = 5 is: 


CONTRAST [1 -1 0 0 0; 1 1 -2 0 0; 1 1 1 -3 0; 1 1 1 1 -4] 


Be sure to enclose the matrix within square brackets and to use a semicolon (;) at the 
end of each row of the matrix—if necessary, use a comma to continue one row of a 
matrix to the next line. Multiple matrix rows can be entered as one row per line or 
written as one line: 


CONTRAST [-1,0,1;1,-2,1] 


/ CLEAR clears any previously specified contrast matrix. 
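
The ng = 5 contrast matrix shown earlier follows a simple pattern: row i has i ones, then -i, then zeros. The sketch below generates that pattern for any number of groups and formats it as a CONTRAST argument. Extending the pattern to other values of ng is an assumption based on the single example given, and the function names are hypothetical.

```python
def default_contrasts(ng):
    """ng - 1 contrast rows: i ones, then -i, then zeros (i = 1 .. ng - 1)."""
    return [[1] * i + [-i] + [0] * (ng - i - 1) for i in range(1, ng)]

def to_command(rows):
    """Format rows as a CONTRAST command with ';' ending each matrix row."""
    body = "; ".join(" ".join(str(v) for v in row) for row in rows)
    return "CONTRAST [" + body + "]"

rows = default_contrasts(5)
print(to_command(rows))
```

Each row sums to zero, as a contrast must, and each row compares one group with the average of the groups before it.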


PLENGTH command 


PLENGTH NONE 
SHORT 
MEDIUM 
LONG 


Controls the amount of output reported and requests specific output panels to print. 
Optionally, include one or more panels not provided with your argument specification. 


The argument SHORT requests the following seven features:


/ FMATRIX    between groups F matrix
  FSTATS     F-to-enter/remove statistics (linear model only)
  EIGEN      eigenvalues and canonical correlations
  CMEANS     canonical scores for group centroids
  SUMMARY    a summary of the variable moved at each step in stepwise analyses
  CLASS      classification matrix
  JCLASS     jackknifed classification matrix


MEDIUM requests the output for SHORT above plus the following six features:


/ MEANS      group frequencies and means
  WILKS      Wilks' lambda with approximate F
  CFUNC      coefficients for the classification functions
  TRACES     Lawley-Hotelling, Pillai, and Wilks' trace
  CDFUNC     coefficients for canonical variables
  SCDFUNC    standardized coefficients for canonical variables


LONG requests the output for SHORT and MEDIUM above plus the following
features:


/ WCOV      pooled within-group covariance matrix
  WCOR      pooled within-group correlation matrix
  TCOV      total covariance matrix
  TCOR      total correlation matrix
  GCOV      groupwise covariance matrices (quadratic model only) plus test for equality
            of covariance matrices
  GCOR      groupwise correlation matrices (quadratic model only)
  MAHAL     Mahalanobis distances, posterior probabilities, and canonical scores for each
            case; must be specified individually.
  CSCORE    canonical scores for each case


Examples:


PLENGTH MEDIUM / MAHAL


PLENGTH NONE / CMEANS FMATRIX JCLASS SCDFUNC
SAVE command 


SAVE filename 
Saves specified statistics to a file. 


/ DATA         the data (with transformations created in the current run)
  SCORES       for each case, includes canonical variable scores. You can request DATA,
               DISTANCES, or SCORES alone, SCORES with DATA, or DISTANCES with
               DATA, but not SCORES and DISTANCES together.
  DISTANCES    for each case, includes the Mahalanobis distances from it to each group
               centroid and the posterior probability of its membership in each group.


ESTIMATE command 


ESTIMATE 
Tells SYSTAT to estimate the analysis specified in MODEL. 


/ TOL=n    the matrix inversion tolerance limit. The default is 0.001.


START command 


START 


Tells SYSTAT to produce preliminary information for a stepwise model building 
process (use STEP to continue). 
/ FORWARD      initializes DISCRIM for forward stepping. SYSTAT reports results for
               Step 0. If omitted, all variables in MODEL that pass the FENTER/FREMOVE
               limits are entered into or removed from the model.
  BACKWARD     initializes DISCRIM for backward stepping.
  TOL=n        the matrix inversion tolerance limit. The default is 0.001.
  ENTER=p      probability of F-to-enter limit.
  REMOVE=p     probability of F-to-remove limit.
  FENTER=n     F-to-enter limit. Variables with F>n are entered into the model if TOL
               permits. The default is 4.0.
  FREMOVE=n    F-to-remove limit. Variables with F<n are removed. The default is 3.9.
  FORCE=n      forces the first n variables in MODEL into the model. The default is 0.


Examples: 


START / BACKWARD 


START / FENTER=10 FREMOVE=9.9 


STEP command 


STEP   no argument
       +
       -
       varlist
       nvari, nvarj, ...


Interactively selects a variable to enter or remove from the model at each step or
automatically lets SYSTAT select candidate variables to move. For automatic
stepping, specify no argument (values of FENTER and FREMOVE guide the selection)
and include AUTO in the option list. For interactive stepping, specify no argument or
any of the other four arguments, repeating STEP for each step. Specify plus (+) to enter
the variable with the largest F-to-enter value (irrespective of the FENTER limit); minus
(-) to remove the variable with the smallest F-to-remove value (irrespective of the
FREMOVE limit); specify names of one or more variables (varlist) to move in (or out)
of the model (irrespective of FENTER and FREMOVE limits); or specify index numbers
of one or more variables (nvari, nvarj, ...) to move in (or out) of the model (irrespective
of FENTER and FREMOVE limits). A variable's index is its order in the input data file.
/ NUMBER=n    moves a number of variables into or out of the model
  AUTO        completes the stepping automatically
  ENTER=p     probability of F-to-enter limit
  REMOVE=p    probability of F-to-remove limit
  FENTER=n    F-to-enter limit. Variables with F>n are entered into the model if TOL
              permits. The default is 4.0.
  FREMOVE=n   F-to-remove limit. Variables with F<n are removed. The default is 3.9.


Stepping command sequences:


                            Stepping
              Interactive          Automatic
FORWARD       START / FORWARD      START / FORWARD
              STEP ...             STEP / AUTO
              STEP ...             STOP (after last step)
              ...
              STOP
BACKWARD      START / BACK         START / BACK
              STEP ...             STEP / AUTO
              STEP ...             STOP (after last step)
              ...
              STOP

No Stepping   ESTIMATE
Examples:

STEP / AUTO 

STEP 

STEP AGE INCOME / FENTER=8 FREMOVE=7.9 


STEP 13 36 28 


STOP command 


STOP 


Stops the stepping and prints final output (that is, computations for the classification 
matrices, eigenvalues, canonical variables, etc.). 
FACTOR:
Factor Analysis 


FACTOR provides principal components analysis (PCA), maximum likelihood analysis 
(MLA), and iterated principal axis (IPA). SYSTAT has options to rotate, sort, plot, and 
save factor loadings. With the PCA method, you can also save the scores and 
coefficients. Orthogonal methods of rotation include varimax, equamax, quartimax, 
and orthomax. A direct oblimin method is also available for oblique rotation. Users can 
explore other rotations by interactively rotating a 3-D Quick Graph plot of the factor 
loadings. Various inferential statistics (for example, confidence intervals, standard 
errors, and chi-square tests) are provided depending on the nature of the analysis that 
is run. 

Before factoring, FACTOR reads a correlation or covariance matrix or computes one 
from a rectangular SYSTAT data file. When data values are missing, SYSTAT can 
compute correlations or covariances using only those cases with all values present 
(listwise deletion) or it can compute each statistic using cases with both values present 
(pairwise deletion). 
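
The difference between the two deletion rules can be sketched in a few lines of Python. This is an illustration only, not SYSTAT's implementation: the helper names `pearson` and `correlation`, the use of `None` for a missing value, and the sample rows are all assumptions made for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists with no missing values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def correlation(rows, i, j, method="listwise"):
    """Correlate columns i and j of rows; None marks a missing value (assumption)."""
    if method == "listwise":
        # Keep only cases in which every variable is present.
        kept = [r for r in rows if None not in r]
    else:
        # Pairwise: keep cases in which both variables of this pair are present.
        kept = [r for r in rows if r[i] is not None and r[j] is not None]
    return pearson([r[i] for r in kept], [r[j] for r in kept]), len(kept)

# Hypothetical data: five cases, three variables.
rows = [
    (1.0, 2.0, 3.0),
    (2.0, 4.0, None),   # third variable missing
    (3.0, 5.0, 1.0),
    (4.0, 9.0, 2.0),
    (5.0, None, 6.0),   # second variable missing
]
r_list, n_list = correlation(rows, 0, 1, "listwise")
r_pair, n_pair = correlation(rows, 0, 1, "pairwise")
print(n_list, n_pair)
```

Listwise deletion uses the same cases for every entry of the matrix, while pairwise deletion can use a different number of cases for each pair of variables.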


Setup:
      * FACTOR
      * USE
      * MODEL
        PLENGTH
        SAVE
HOT   * ESTIMATE
MODEL command 


MODEL varlist 


Computes principal components or extracts factors for all variables in varlist. 
PLENGTH command


PLENGTH SHORT
        MEDIUM
        LONG


Controls the amount of output reported. To request extended results, enter PLENGTH
prior to ESTIMATE. Select one of three categories of output.


SHORT    Latent roots or eigenvalues (not MLA), initial and final communality
         estimates (not PCA), component loadings (PCA) or factor pattern (MLA, IPA),
         variance explained by components (PCA) or factors (MLA, IPA), percentage
         of total variance explained, change in uniqueness and log likelihood at
         each iteration (MLA only), and canonical correlations (MLA only). When a
         rotation is requested: rotated loadings (PCA) or pattern (MLA, IPA) matrix,
         variance explained by rotated components, percentage of total variance
         explained, and correlations among oblique components or factors (oblimin
         only).

MEDIUM   SHORT output plus the matrix to factor, the chi-square test that all
         eigenvalues are equal (PCA only), the chi-square test that the last k
         eigenvalues are equal (PCA only), and differences of original correlations
         or covariances minus fitted values. For covariance matrix input (not MLA or
         IPA): asymptotic 95% confidence limits for the eigenvalues and estimates of
         the population eigenvalues with standard errors.

LONG     MEDIUM output plus latent vectors (eigenvectors) with standard errors (not
         MLA) and the chi-square test that the number of factors is k (MLA only).
         With an oblimin rotation: direct and indirect contribution of factors to
         variances and the rotated structure matrix.


SAVE command

SAVE filename 


Saves specified statistics to a file. You cannot save casewise results when using
correlation or covariance matrix input. The statistics available to be saved depend on
the analysis.
Principal components (PCA) offers the following seven options:


/ SCORES    standardized factor scores (mean 0, variance 1)
  DATA      data. Use with options for scores.
  LOAD      component or factor loadings
  COEF      coefficients for computing factor scores. Use with standardized
            variables. If input is COVA, multiply the coefficients times the
            original variables.
  VECTORS   eigenvectors of the correlation or covariance matrix. Do not use with
            ROTATE, IPA, or MLA.
  PC        unstandardized factor scores (mean 0, variance is the eigenvalue for
            each factor)
  RESID     residuals (actual z scores minus predicted z scores) for each case plus
            Q (sum of the squared residuals) and PROB (the upper-tail probability
            for Q)


With SCORES or PC, T² (Hotelling's T²) and PROB (the associated probability) and
the input data are also saved.


Maximum Likelihood (MLA) and the Common Factor Model offer the following 
option: 


/ LOAD component or factor loadings 


Factor scores for these methods are unavailable because they are indeterminate. 


ESTIMATE command 


ESTIMATE 


Tells SYSTAT to start the computations. 


For cases-by-variables input 


/ LISTWISE   if any value is missing among the selected variables, omit the case from
             the analysis. This is the default option.
  PAIRWISE   computes correlations or covariances separately for each pair of
             variables using cases that have both values present.
For CORR and COVA input


/ N=n        sample size of the input correlation or covariance matrix (not used with
             cases-by-variables input). When PLENGTH is LONG, SYSTAT uses n for
             inferential statistics.


Analysis and output

/ CORR          specifies the type of matrix to factor for cases-by-variables input.
  COVA          Not needed when inputting a COVA or CORR matrix from a SYSTAT file.

  METHOD= PCA   principal components method of factor extraction.
          IPA   iterated principal axis factoring, an iterative method that finds
                common factors by starting with the principal components solution
                and iteratively solving for communalities.
          MLA   maximum likelihood factor analysis, a method that iteratively finds
                communalities and common factors.

  NUMBER=n      number of factors to print and rotate. n must be an integer between
                1 and the number of variables specified in MODEL. The default is
                the number of variables. If both EIGEN and NUMBER are used, the
                smaller number of components is used. For the maximum likelihood
                method, NUMBER is ignored when n exceeds the number of degrees
                of freedom.


For cases-by-variables input 


/ SORT       sorts the factor loadings from largest to smallest and groups loadings
             greater than 0.5 in absolute value together within each component.
  EIGEN=n    smallest eigenvalue (for its associated factor) to retain. Only
             components with eigenvalues larger than n are printed and rotated. If
             both EIGEN and NUMBER are specified, the smaller number of components
             is used. EIGEN is ignored for the maximum likelihood method. The
             default is 1.0. For COVA, the default is the average variance.
  ITER=n     number of iterations to perform for the MLA and IPA methods. The
             default is 25.
  CONV=n     convergence criterion for the MLA and IPA methods, where 0 <= n <= 1.
             The default is 0.001.
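
The interaction between EIGEN and NUMBER described above can be sketched as follows. This is a minimal illustration of the stated rule for the PCA case with correlation input, not SYSTAT code; the function name `components_retained` and the eigenvalues are hypothetical.

```python
def components_retained(eigenvalues, number=None, eigen=1.0):
    """Count the components that would be printed and rotated.

    number: the NUMBER=n option (default: the number of variables).
    eigen:  the EIGEN=n option (default 1.0 for correlation input).
    """
    if number is None:
        number = len(eigenvalues)
    # Components whose eigenvalue exceeds the EIGEN threshold.
    by_eigen = sum(1 for ev in eigenvalues if ev > eigen)
    # When both limits apply, the smaller number of components is used.
    return min(number, by_eigen)

# Hypothetical eigenvalues for four variables.
eigenvalues = [2.9, 1.4, 0.4, 0.3]
print(components_retained(eigenvalues))              # EIGEN default only
print(components_retained(eigenvalues, number=1))    # NUMBER is the smaller limit
print(components_retained(eigenvalues, eigen=0.35))  # lower threshold keeps more
```

The sketch does not cover the maximum likelihood method, for which EIGEN is ignored, or covariance input, for which the default threshold is the average variance.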
Rotation methods:


/ ROTATE = VARIMAX     rotates factors by the specified method. OBLIMIN produces
           EQUAMAX     direct oblimin rotation, or “oblique” rotation, allowing the
           QUARTIMAX   factors to be correlated. The default is no rotation.
           ORTHOMAX
           OBLIMIN

  GAMMA = n            use with ORTHOMAX or OBLIMIN. For ORTHOMAX, the default is 1.
                       For OBLIMIN, the default is 0. You can use GAMMA=0 for
                       moderate correlations, use positive values to allow higher
                       correlations, and use negative values to restrict
                       correlations. Varying GAMMA with ORTHOMAX changes
                       (gradually) the maximization of variance of loadings from
                       columns (VARIMAX) to rows (QUARTIMAX).
FITDIST:
Fitting Distributions 


The FITDIST module fits probability distributions to specified data. You can specify
the parameters of the distribution under consideration; if the required parameters
are not specified, SYSTAT estimates them. For discrete distributions, you can use
either raw data or a frequency distribution as input. For continuous distributions,
you can use only raw data, and you can select more than one distribution for fitting
without specifying the parameters. Two goodness-of-fit tests are available: the
chi-square test and the Kolmogorov-Smirnov test.
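
As a rough illustration of fitting a discrete distribution to a frequency table, the sketch below fits a Poisson distribution and computes a chi-square statistic. It is not SYSTAT's algorithm: the function name `poisson_fit` and the frequencies are hypothetical, the sample mean is used as the estimate of lambda (the maximum likelihood estimate for the Poisson), and no pooling of sparse cells or upper-tail category is applied.

```python
from math import exp, factorial

def poisson_fit(freq):
    """freq maps each observed count k to its observed frequency."""
    n = sum(freq.values())
    # Estimate lambda by the sample mean when it is not supplied.
    lam = sum(k * f for k, f in freq.items()) / n
    # Expected frequency for each observed count under Poisson(lam).
    expected = {k: n * exp(-lam) * lam ** k / factorial(k) for k in freq}
    # Chi-square statistic over the observed cells only (simplification).
    chi2 = sum((freq[k] - expected[k]) ** 2 / expected[k] for k in freq)
    return lam, expected, chi2

# Hypothetical frequency distribution: count -> number of cases.
observed = {0: 35, 1: 40, 2: 18, 3: 5, 4: 2}
lam, expected, chi2 = poisson_fit(observed)
print(round(lam, 2), round(chi2, 3))
```

In practice the statistic would be referred to a chi-square distribution whose degrees of freedom account for the estimated parameter.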


Setup:
      * FITDIST
      * USE
        SAVE
HOT   * DISCRETE or CONTINUOUS
DISCRETE command 


DISCRETE varlist 
Fits a discrete distribution to all numerical variables in varlist.


/ DISTRIBUTION = name                   specifies the name of a discrete
                                        distribution. The default is P (Poisson).
  DISTRIBUTION = name(parameter_list)   specifies the name of a discrete
                                        distribution and its parameter(s).


Examples: 
DISTRIBUTION=P 
DISTRIBUTION=N(10) 


DISTRIBUTION=NB(2,0.5) 
Note: For discrete distributions, the FREQUENCY option is available. You can use
FREQUENCY before the DISCRETE command.


Distribution         Name   Parameter list
Benford's law        BL     B
Binomial             N      n, p or n
Discrete uniform     DU     N
Geometric            GE     p
Hypergeometric       H      N, m, n or m, n
Logarithmic series   LS     theta
Negative binomial    NB     k, p
Poisson              P      lambda
Zipf                 ZI     shp

CONTINUOUS command 


CONTINUOUS varlist 
Fits continuous distribution(s) to all numerical variables in varlist. 


/ DISTRIBUTION = namelist               specifies a list of distribution name(s)
                                        for fitting. The default is Z (normal).
  DISTRIBUTION = name(parameter_list)   specifies the distribution and the
                                        associated parameter(s).
Examples:


DISTRIBUTION=L GU C
DISTRIBUTION=Z(10, 2)


Distribution                  Parameter_list
Beta                          shp1, shp2
Cauchy                        loc, sc
Chi-square                    df
Erlang                        shp, sc
Exponential                   loc, sc
Gamma                         shp, sc
Gompertz                      b, c
Gumbel                        loc, sc
Inverse Gaussian/Wald         loc, sc
Laplace/Double exponential    loc, sc
Logistic                      loc, sc
Logitnormal                   loc, sc
Loglogistic                   logsc, shp
Lognormal                     loc, sc
Normal                        loc, sc
Pareto                        thr, sc
Rayleigh                      sc
Smallest extreme value        loc, sc
Triangular                    a, b, c
Uniform                       min, max
Weibull                       sc, shp
SAVE command


SAVE filename


Saves the observed and expected frequencies along with class intervals and name
(name list) of distribution(s) fitted to data in column(s) specified in varlist.
GLM:
General Linear Models


General Linear Model (GLM) can estimate and test any univariate or multivariate 
general linear model, including those for multiple regression, analysis of variance or 
covariance, and other procedures such as discriminant analysis and principal 
components. With the general linear model, you can explore randomized block 
designs, incomplete block designs, fractional factorial designs, Latin square designs, 
split plot designs, crossover designs, nesting, and more. The model is: 


Y = XB + e


where Y is a vector or matrix of dependent variables, X is a vector or matrix of 
independent variables, B is a vector or matrix of regression coefficients, and e is a 
vector or matrix of random errors. See Searle (1971), Winer (1971), Neter et al. (1996), 
or Cohen and Cohen (1983) for details. 
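
For the simplest case of this model (one dependent variable, a constant, and one independent variable), the least-squares estimates of B and the residuals e can be sketched as below. This is a textbook illustration under those assumptions, not SYSTAT's estimation code; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares estimates for y = b0 + b1*x + e (one predictor plus constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx              # slope
    b0 = my - b1 * mx           # constant
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return b0, b1, residuals

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.2, 8.8]
b0, b1, residuals = fit_line(x, y)
print(round(b0, 2), round(b1, 2))
```

With a constant in the model, the residuals sum to zero, which is a quick check that the fit was computed correctly.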


ANOVA and ANCOVA. GLM can handle a wide variety of balanced and unbalanced 
analysis of variance designs, including one-way analysis of variance, factorial 
ANOVA models, randomized block designs, incomplete block designs, fractional 
factorials, Latin square designs, and analysis of covariance with one or more 
covariates. SYSTAT also includes repeated measures, split plot, and crossover designs. 
It includes both univariate and multivariate approaches to repeated measures designs. 

The means model for missing cells designs is available. For models with fixed and 
random effects, you can define error terms for specific hypotheses. You can also do 
stepwise ANOVA (that is, Type I Sum of Squares). Categorical variables are entered 
or deleted in blocks, and you can examine interactively or automatically all 
combinations of interactions and main effects. Once you have estimated your ANOVA 
model, it is easy to test post hoc pairwise differences in means or to test any contrast 
across cell means, including simple effects. 


Regression. GLM can estimate and test simple and multiple linear regression models. 
Variations include mixture models (constraining the independent variables to sum to a 
constant), polynomial regression (including terms like x² and x³ in the model), and
stepwise regression (choosing a subset of predictors from among many candidates). 
You can store results of the analysis, predicted values, residuals, and diagnostics that 
identify unusual cases for further use in examining assumptions. 
Multivariate models. In all the multivariate models, Y is a matrix of continuous
measures. The X matrix can be either continuous or categorical dummy variables,
according to the type of model. For discriminant analysis, X is a matrix of dummy
variables, as in ANOVA. For principal components analysis, X is a single column of
1's, namely, a constant. For canonical correlation, X is usually a matrix of continuous
right-hand variables (and Y is the matrix of left-hand variables).


Testing hypotheses. Once the parameters of a model have been estimated, they can be 
tested by any general linear hypothesis of the following form: 


ABC'=D 


where A is a matrix of linear weights on coefficients across the independent variables 
(the rows of B), C is a matrix of linear weights on the coefficients across dependent 
variables (the columns of B), B is the matrix of regression coefficients or effects, and 
D is a null hypothesis matrix (usually a null matrix). For the multivariate models 
described here, the C matrix is an identity matrix and the D matrix is null. The A matrix 
can have several different forms, but these are all submatrices of an identity matrix, and 
are easily formed using HYPOTHESIS. 
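
To make the form ABC' = D concrete, the following sketch evaluates the left-hand side for a small, made-up coefficient matrix. The matrices, helper names, and values are hypothetical, and the sketch only computes the quantity being compared with D; the test statistic itself also requires the error matrix, which is not shown.

```python
def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(M):
    return [list(col) for col in zip(*M)]

# Hypothetical coefficient matrix B: rows are CONSTANT, X1, X2; columns are Y1, Y2.
B = [[10.0, 20.0],
     [1.5, 0.0],
     [1.5, -2.0]]

A = [[0, 1, -1]]        # weights across independent variables: X1 minus X2
C = [[1, 0], [0, 1]]    # identity: each dependent variable considered separately
D = [[0.0, 0.0]]        # null matrix

ABCt = matmul(matmul(A, B), transpose(C))
print(ABCt)  # compare with D
```

Here the first entry equals the corresponding entry of D, while the second does not; whether that departure is significant is what the hypothesis test assesses.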
Setup:


      * GLM
      * USE
      * MODEL
        CATEGORY
        PLENGTH
        SAVE
HOT     ESTIMATE or START
        STEP
        (series of steps)
        STEP
        STOP
        HYPOTHESIS
        ALL
        EFFECT
        WITHIN
        SPECIFY (contrast language)
        CONTRAST [matrix]
        AMATRIX [matrix]
        CMATRIX [matrix]
        DMATRIX [matrix]
        POST
        PAIRWISE
        ERROR
        STAND
        PRIORS
        FACTOR
        TYPE
        ROTATE n
        SAVE
HOT     TEST


GLM features and options by type of analysis 


For Regression, consider: 


MODEL /N=n 

ESTIMATE / MIX NTEST = KS, AD, SW QUICK or NOQUICK 

START / BACKWARD FORWARD ENTER REMOVE FORCE TOL MAXSTEP 
STEP / AUTO ENTER REMOVE FENTER FREMOVE 

STOP / QUICK or NOQUICK 


For ANOVA and ANCOVA, consider: 


MODEL / REPEAT NAMES MEANS WEIGHT 

CATEGORY / EFFECT DUMMY 

ESTIMATE / NTEST = KS, AD, SW HTEST = LEVENE SS = TYPE1 or TYPE2 
or TYPE3 QUICK or NOQUICK 

Under HYPOTHESIS 

POST / LSD TUKEY BONF SCHEFFE SIDAK BTUKEY DUNCAN GT2 
GABR QREG GH T2 T3 POOLED SEPARATE DUNNETT = LT or 
GT or TWO CONTROL 

WITHIN 

CONTRAST /ADJDIFF POLYNOMIAL SUM ORDER METRIC DEVIATION [c] 
SIMPLE [c] HEL RHEL 

SPECIFY / POOLED SEPARATE 

CMATRIX 

ERROR 

SAVE 


For Multivariate Models, consider: 


PRIORS 

STANDARDIZE / TOTAL or WITHIN 

FACTOR / HYPOTHESIS or ERROR 

TYPE / SSCP or COVARIANCE or CORRELATION 
ROTATE 


SAVE 
Model Building


MODEL command 


MODEL varlist1 = CONSTANT + varlist2 + var1*var2 + var3(var4) 


Specifies the linear model to estimate. varlist1 specifies the dependent variable(s). 
varlist2 specifies main effects (independent variables). CONSTANT is an optional 
parameter (when in doubt, include the CONSTANT). Specify interactions by linking 
variables with an asterisk. The parentheses denote nested factors. Use CATEGORY to 
specify all design factors. Model variables not tested with CATEGORY are covariates. 


/ REPEAT=m,n, ...                    number of levels for each repeated measures
                                     factor. For example, if you have DAY with
                                     three levels and AM_PM with two levels, type
                                     REPEAT=3,2 for data ordered as D1_AM, D1_PM,
                                     D2_AM, D2_PM, D3_AM, D3_PM.
  REPEAT=m(x1,x2, ...) n(y1,y2, ...) m and n are as defined above. Use xi and yi to
                                     specify the scale for orthogonal polynomials.
                                     Specify one number for each level of the
                                     repeated measures.
  NAMES='name1', 'name2', ...        a name for each repeated measures factor.
  MEANS                              specifies a fully factorial design using means
                                     coding.
  WEIGHT                             weights cell means by the cell counts before
                                     averaging.
  N=n                                when your data file is a symmetric matrix (for
                                     example, a covariance matrix), specify the
                                     sample size n that generated the matrix.
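
The column ordering that REPEAT expects (the last-named factor varying fastest) can be generated mechanically. The sketch below is only an illustration of that ordering for the DAY by AM_PM example; the function name `repeated_measures_order` is hypothetical.

```python
from itertools import product

def repeated_measures_order(factors):
    """factors: list of (name, levels) pairs, slowest-varying factor first.

    Returns one dict per dependent variable, in the order the variables
    should appear on the left side of MODEL.
    """
    names = [name for name, _ in factors]
    level_ranges = [range(1, levels + 1) for _, levels in factors]
    # itertools.product varies the last factor fastest.
    return [dict(zip(names, combo)) for combo in product(*level_ranges)]

cells = repeated_measures_order([("DAY", 3), ("AM_PM", 2)])
for cell in cells:
    print(cell)
```

The six cells come out as day 1 morning, day 1 afternoon, day 2 morning, and so on, matching the ordering given for REPEAT=3,2.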


Note: To display results for single degree of freedom polynomial contrasts for each 
repeated measure factor, insert the command PLENGTH MEDIUM before ESTIMATE. 


Examples: 
MODEL Y = CONSTANT + A + B + C + A*B + B*C
MODEL LEARNING = CONSTANT + SCHOOL + TEACHER(SCHOOL) 
CATEGORY command


CATEGORY grpvarlist 


Specifies numeric or string grouping variables that define cells. Specify for all 
categorical variables for which GLM should generate design variables. 


/MISS allows cases with a missing value for the categorical variable to be included 
in the analysis 
EFFECT produces parameter estimates which are differences from group means 
DUMMY produces dummy codes for the design variables instead of effect codes 


Example: 


CAT SEX DOSE$ 


SAVE command 


SAVE filename 


Before ESTIMATE. Saves specified statistics to a file, displays the Durbin-Watson
statistic and the first-order autocorrelation, and identifies outliers, if any.


/ COEF       estimates of the regression coefficients
  MODEL      saves the variables from your MODEL statement in addition to the
             statistics given by RESID
  RESID      produces residuals, predicted values, and diagnostics. For a univariate
             model (one dependent variable), this includes the estimate, residual,
             leverage, Cook’s D, Studentized residual, and standard error of
             prediction for each case. For a multivariate model, SYSTAT saves the
             estimate, residual, and leverage.
  DATA       saves data from original file
  PARTIAL    partial residuals
  ADJUSTED   adjusted cell means from analysis of covariance


After HYPOTHESIS. For discriminant analysis, this command saves canonical
variable scores (FACTOR), Mahalanobis distances to each group center (DISTANCES),
posterior probabilities (PROB), original group membership (GROUP), and predicted
group membership (PREDICT). For principal components, scores are saved (FACTOR).
For canonical correlations, canonical scores for the left-hand set (FACTOR) are saved.
For right-hand set scores, reverse the sets and refit your model.


PLENGTH command 


PLENGTH SHORT
        MEDIUM
        LONG


For estimation, MEDIUM prints least-squares cell means and standard deviations with
analysis of variance. LONG adds the total sum of product matrix, residual (or pooled
within groups) sum of product matrix, residual (or pooled within groups) covariance
matrix, and the residual (or pooled within groups) correlation matrix. For hypothesis
testing, LONG adds A, C, and D matrices, the matrix of contrasts, and the inverse of
the cross-products of contrasts, hypothesis and error sum of product matrices, tests of
residual roots, canonical correlations, coefficients, and loadings.
ESTIMATE command


ESTIMATE


Tells SYSTAT to estimate the analysis specified in MODEL. For stepwise procedures,
use START instead.


/ MIX              estimates a mixture model. Your independent variables must sum
                   to a constant.
  TOL=n            prevents entry of a variable that is highly correlated with the
                   independent variables already included in the model
  NTEST = KS       Kolmogorov-Smirnov test (Lilliefors) for testing normality
          SW       Shapiro-Wilk test for normality
          AD       Anderson-Darling test for normality
  HTEST = LEVENE   Levene's test for homogeneity of variances
  SS = TYPE1       uses type I sum of squares for analysis
       TYPE2       uses type II sum of squares for analysis
       TYPE3       uses type III sum of squares for analysis
  CONFI = n        displays the confidence intervals of the regression coefficients
                   at the desired level of confidence. The confidence intervals are
                   displayed only when the MEDIUM and LONG options of PLENGTH are
                   active. The default level is 0.95.
  QUICK            displays quick graphs even though global graph option is off.
  NOQUICK          suppresses quick graphs even though global graph option is on.


Stepwise model building


START command


START 


Use in place of ESTIMATE to tell SYSTAT to produce preliminary information for a 
stepwise model building process (use STEP to continue). 
/ FORWARD     initializes GLM for forward stepping. SYSTAT reports results for
              Step 0. If omitted, all variables in MODEL that pass the
              FENTER/FREMOVE limits are entered into or removed from the model.
  BACKWARD    initializes GLM for backward stepping
  TOL=n       the matrix inversion tolerance limit. The default value is 0.001.
  ENTER=p     probability of F-to-enter limit
  REMOVE=p    probability of F-to-remove limit
  FENTER=n    F-to-enter limit. Variables with F > n are entered into the model if
              TOL permits. The default value is 4.
  FREMOVE=n   F-to-remove limit. Variables with F < n are removed. The default
              value is 3.9.
  FORCE=n     forces the first n variables in MODEL into the model. The default
              value is 0.
  MAXSTEP=n   maximum number of steps to take


Examples: 


START / BACK 


START / FENTER=10 FREMOVE=9.9 


STEP command 


STEP no argument 
var or index 


Interactively selects a variable to enter or remove from the model at each step or 
automatically lets SYSTAT select candidate variables to move. For automatic 
stepping, specify no argument (values of FENTER and FREMOVE guide the selection) 
and include AUTO in the option list. For interactive stepping, specify no argument or 
specify name (or index number) of variable to move in (or out) of the model 
(irrespective of FENTER and FREMOVE limits). A variable’s index is its order in the 
input data file. Repeat STEP to continue moving variables. 
/ AUTO        completes the stepping automatically
  ENTER=p     probability of F-to-enter limit
  REMOVE=p    probability of F-to-remove limit
  FENTER=n    F-to-enter limit. Variables with F > n are entered if TOL permits.
              The default value is 4.
  FREMOVE=n   F-to-remove limit. Variables with F < n are removed. The default
              value is 3.9.


Examples: 

STEP / AUTO 

STEP AGE 

STEP INCOME / FENTER=8 FREMOVE=7.9
STEP 13 


STOP command 


STOP 


Stops the stepping and prints final output. The model is recomputed using all cases that 
have no values missing for the final subset of variables (that is, fewer cases can be used 
at the last step than for this final report). 


/ QUICK displays quick graphs even though global graph option is off. 
NOQUICK suppresses quick graphs even though global graph option is on. 


Hypothesis testing 


Toggling between the command line and the GUI is supported in ANOVA, GLM, MANOVA,
MIXED, LOGIT, and RSM. That is, if estimation is performed through a dialog box, then
post-estimation analysis can be performed through commands, and vice versa.
HYPOTHESIS command


HYPOTHESIS 


Tells SYSTAT that you want to test hypotheses on a previous MODEL. You must enter 
HYPOTHESIS before you can use any of the following commands. 


ALL command 


ALL 


Tests all of the coefficients in the model. For canonical correlations, you need to 
specify ALL. ALL is a shortcut if you want to test all the effects in your model. This 
automatically generates separate tests for each effect in the model. 


EFFECT command 


EFFECT varlist, var1 & var2, ...


Specifies the effect(s) to include in the hypothesis test. Varlist specifies main effects.
Main effects are specified jointly by joining them with an ampersand (&).
Interactions are specified by joining two variables with an asterisk (*). For discriminant
analysis, specify the name of your grouping variable. For principal components, you
do not need to specify an EFFECT unless you have a grouping variable for
within-groups components. For canonical correlations, you need to specify ALL.


Example: 


EFFECT VAR1 & VAR2 & VAR3 & VAR1*VAR2
WITHIN command


WITHIN 'factorname'


Specifies hypotheses across a single repeated measures factor. The name specified with 
WITHIN is from the NAMES option of the MODEL statement. Use CONTRAST to 
construct the specified contrasts. 


SPECIFY command 


SPECIFY hypothesis language / options 


Automatically codes A, C, and D matrices. The syntax allows for coefficients, 
addition, subtraction, multiplication of coefficients, equality, and constants: 


-3*DRUG[1] - 1*DRUG[2] + 1*DRUG[3] + 3*DRUG[4] = 0


With the effects model, effects are contrasted. With the means model, means are 
contrasted. 


/ POOLED uses the error term from the current model. 
SEPARATE generates a separate variances error term from the hypothesis estimated. 


Separate variance hypothesis tests are limited to a single degree of freedom. 


Example: 


SPECIFY DRUG[1] DISEASE[2] = DRUG[3] DISEASE[2]


CONTRAST command 


CONTRAST [matrix] 


Generates a contrast on a repeated measures factor or grouping variable. Identify the 
repeated measures factor with WITHIN or grouping variables with EFFECT. 
/ ADJDIFF          compares each level of the factor with its adjacent level
  DEVIATION [c]    compares the mean of each level of the selected categorical
                   variable (excluding reference level c) to the mean of all the
                   levels (grand mean)
  SIMPLE [c]       compares the mean of each level to the mean of a specified
                   reference level c
  HELMERT          compares the mean of each level of the selected factor with the
                   mean of succeeding levels
  RHELMERT         compares the mean of each level of the selected factor with the
                   mean of the previous levels
  POLYNOMIAL       generates orthogonal polynomial comparisons. The default is
                   equally spaced polynomials.
  ORDER=n          a polynomial of order n. The default is all orthogonal
                   polynomials.
  METRIC=m,n,...   generates unequally spaced orthogonal polynomials. Specify one
                   number per level.
  SUM              computes and tests the sum of the levels


Examples:


WITHIN 'TIME'


CONTRAST / POLY, METRIC=2,4,8,16


CONTRAST [-3 -1 1 3]
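

Orthogonal polynomial contrasts such as [-3 -1 1 3] can be derived from the level values by Gram-Schmidt orthogonalization of their powers. The sketch below shows that construction with exact fractions. It is an illustration of the general technique, not SYSTAT's routine; the function name `orthogonal_polynomials` is hypothetical, and SYSTAT may scale its contrasts differently.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthogonal_polynomials(metric, order=None):
    """Return contrasts of degree 1..order for the given level values.

    metric: one number per level (equally or unequally spaced).
    order:  highest polynomial degree (default: number of levels minus 1).
    """
    x = [Fraction(v) for v in metric]
    if order is None:
        order = len(x) - 1
    basis = [[Fraction(1)] * len(x)]       # degree 0: the constant
    for degree in range(1, order + 1):
        v = [xi ** degree for xi in x]
        # Remove the parts of x**degree explained by lower-degree terms.
        for b in basis:
            coef = dot(v, b) / dot(b, b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        basis.append(v)
    return basis[1:]                        # drop the constant

linear, quadratic, cubic = orthogonal_polynomials([1, 2, 3, 4])
print([int(2 * c) for c in linear])     # linear contrast, rescaled to integers
print([int(c) for c in quadratic])

# Unequally spaced levels, as in METRIC=2,4,8,16
u_linear, u_quadratic, u_cubic = orthogonal_polynomials([2, 4, 8, 16])
```

For four equally spaced levels, the linear contrast is proportional to -3, -1, 1, 3, which is the matrix given in the CONTRAST example above.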


AMATRIX command 


AMATRIX [matrix] 


Specifies a design matrix for your hypothesis (for between-subjects effects). 


Example: 


AMATRIX [0 10-100 1] 


CMATRIX command 


CMATRIX [matrix] 
Specifies a design matrix for the dependent variables.


Example: 


CMATRIX [-1 -1 -1 1 1 1]


DMATRIX command 


DMATRIX [matrix] 


Identifies the value of the null hypothesis. The default value is 0 for each effect. 


Example: 


DMATRIX [10 15] 


POST command 


POST grpvar 


Performs pairwise mean comparisons. grpvar is an effect without covariates as it 
appears in the model command with the effects model. With the means model, grpvar 
indicates a set of marginal means. 


/ LSD                   LSD pairwise comparisons
  TUKEY                 Tukey-Kramer HSD pairwise comparisons
  BONF=n                Bonferroni comparisons (n is optional for the number of
                        comparisons you planned a priori to make)
  SCHEFFE               Scheffé pairwise comparisons
  SIDAK                 pairwise comparisons based on Student's t statistic
  DUNNETT = LT          specifies less than alternative for Dunnett test
            GT          specifies greater than alternative for Dunnett test
            TWO         specifies two-sided alternative for Dunnett test
  CONTROL='levelname'   use with Dunnett to name the control group against which
                        comparisons are made
  SNK                   Student-Newman-Keuls pairwise comparisons
  BTUKEY                Tukey’s b pairwise comparisons
  DUNCAN                Duncan pairwise comparisons
  GT2                   Hochberg’s GT2 pairwise comparisons
  GABR                  Gabriel pairwise comparisons
  QREG                  Ryan-Einot-Gabriel-Welsch pairwise comparisons based on
                        Studentized range distribution
  GH                    Games-Howell pairwise comparisons for unequal variances
  T2                    Tamhane’s T2 pairwise comparisons for unequal variances
  T3                    Dunnett T3 pairwise comparisons for unequal variances
  POOLED                uses the error term from the current model
  SEPARATE              separate variance option; can cross with pairwise mean
                        comparisons


Example:


POST DRUG / BONF


PAIRWISE command


PAIRWISE factorname / BONFERRONI or SIDAK


Performs post hoc tests for repeated measures.


ERROR command 


ERROR varlist
      var1 & var2
      value (df)
      matrix


Specifies an error term for hypothesis testing. Varlist specifies main effects.
Interactions are specified by joining two variables with an ampersand (&). Value
indicates a numeric value for the mean square error, optionally followed by degrees of
freedom (df) enclosed in parentheses. For multivariate designs, follow the ERROR
command with an error covariance matrix (press Enter before entering the matrix).


Examples: 
ERROR PLOT(A) 


ERROR 49.8(4) 


STAND command 


STAND TOTAL 
WITHIN 


Specifies the method of standardizing canonical coefficients. WITHIN is used in 
discriminant analysis to make comparisons easier when the measures are on different 
scales. TOTAL is used in canonical correlations. WITHIN is the default. 


PRIORS command 


PRIORS m n p ...


Specifies prior probabilities of group membership in discriminant analysis. m 
corresponds to the first level of your grouping variable, n to the second, and so on. The 
default is equal probability for each group. If the values do not sum to 1, SYSTAT 
normalizes them so that they do. 


Example: 


PRIORS .5, .3, .2 
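
The normalization described above can be sketched in a few lines of Python. This is an illustration only: it assumes the rescaling is a simple division by the total, which the text implies but does not spell out, and the function name is hypothetical.

```python
def normalize_priors(priors):
    """Rescale non-negative prior weights so that they sum to 1.

    Assumption: normalization divides each value by the total.
    """
    total = sum(priors)
    if total <= 0:
        raise ValueError("priors must have a positive sum")
    return [p / total for p in priors]


# Weights given as 5, 3, 2 describe the same priors as .5, .3, .2.
print(normalize_priors([5, 3, 2]))
```

Under this assumption, PRIORS 5, 3, 2 and PRIORS .5, .3, .2 would describe the same prior probabilities.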


FACTOR command 


FACTOR HYPOTHESIS 
ERROR 


In a factor analysis with grouping variables, identifies the matrix to factor. 
HYPOTHESIS factors the sum of product hypothesis matrix. ERROR factors the 
residual correlation, covariance or SSCP matrix (depending on the type of file used). 
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TYPE command 


TYPE CORR 
COVAR 
SSCP 


For factor analysis, specifies the type of matrix to factor. All factoring is done with
listwise deletion.


ROTATE command 


ROTATE n 


Rotates n factors by varimax rotation. Usually used for principal components analysis. 
For canonical correlations, use ROTATE to rotate the dependent canonical factors. For 
discriminant analysis, rotate the canonical variables. 


SAVE command 


SAVE filename 


Saves scores and results to a SYSTAT data file.


TEST command 


TEST 
Initiates the test of your hypothesis. 


/ CONFI=n    level of confidence for hypothesis estimates (0 < n < 1). The default
             value is 0.95.
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LOGIT: 
Logistic Regression 


The LOGIT module is SYSTAT’s comprehensive package for logistic regression 
analysis. It provides tools for model building, model evaluation, prediction, 
simulation, hypothesis testing, and regression diagnostics. 

LOGIT estimates a multinomial LOGIT model for qualitative dependent variables. It 
performs conditional logistic regression, the econometric discrete choice model, 
general linear hypothesis testing, score tests, odds ratios and confidence intervals, 
forward, backward and interactive stepwise regression, Pregibon regression 
diagnostics, prediction success and classification tables, independent variable 
derivatives and elasticities, model-based simulation of response curves, deciles of risk 
tables, options to specify start values and to separate data into learning and test 
samples, robust standard errors, control of significance levels for confidence interval 
calculations, zero/one dependent variable coding, choice of reference group in 
automatic dummy variable generation, and integrated plotting tools. 
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Setup: 


* LOGIT 


* 


HOT * 


HOT 


USE 
CATEGORY 
NCAT 
ALT 
SET 
MODEL 
PLENGTH 
SAVE 
ESTIMATE or START 
STEP 
(series of STEP commands) 
STOP 
SAVE 
DC 
QNTL 
SIMULATE 
HYPOTHESIS 
CONSTRAIN 
TEST 


MODEL command 


MODEL depvar = CONSTANT + indvarexp 


depvar = condvarlist; polyvarlist 


Specifies a model to be estimated. The optional parameter CONSTANT adds a constant 
to the logit model. There are three forms of the MODEL statement available; the correct 
form depends on the type of model you want to estimate and how your data are 
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organized. The parameter names are used in the MODEL statement for conditional and 
discrete choice problems. 


Model Type                 MODEL Statement

Pure Polytomous            MODEL depvar = CONSTANT + indvarlist
(binary or multinomial)

Conditional Logit          MODEL depvar = condvarlist; polyvarlist

Discrete Choice            MODEL depvar = label1 + label2 + ...
                           where label denotes a set of variables defined by the
                           SET command.

Example: 


MODEL LOW=CONSTANT+LWD 


SAVE command 


SAVE filename 


Saves specified statistics to a file. 


/ PREDICTED    saves the predicted probabilities.
ROC            saves the ROC curve points (binary logit models only).


PLENGTH command


PLENGTH SHORT 
LONG 


Controls the amount of output reported. To request extended results, enter PLENGTH
prior to ESTIMATE. Select one of two categories of output:


SHORT    produces the log-likelihood at each iteration, parameter estimates and their
         standard errors, z-ratios, p-values, confidence intervals, the log-likelihood
         of the constant-only model, McFadden's rho-squared, Cox and Snell R-squared,
         and Nagelkerke's R-squared. For binary logistic regression, it also
         produces the ROC curve as a quick graph.


LONG     SHORT plus the covariance matrix and correlation matrix of the parameter
         estimates.
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ESTIMATE 


Causes the previously specified model to be estimated. 


/ PREDICT                 produces a prediction-of-success table.
TOLERANCE=d               sets the tolerance. The default value is 1.0E-12.
ITER=n                    sets the maximum number of iterations. The default
                          value is 50.
CONVERGE=d                sets the convergence criterion. This is the largest
                          relative change in any coordinate before iterations
                          terminate. The default value is 1.0E-6.
RSE                       produces robust standard errors.
MEANS                     average value.
CLASS                     with Multinomial Logit, produces a model classification
                          table.
CLASS=cutoff              with Binary Logit, produces a model classification table
                          with the desired cutoff point. The default cutoff point is
                          0.5.
DERIVATIVE=INDIVIDUAL     with Multinomial Logit, produces a derivative table,
           AVERAGE        evaluating derivatives for each INDIVIDUAL observation,
                          or once at the sample AVERAGE of the covariates.
CONFI=n                   specifies the confidence level. The default value is 0.95.


CATEGORY command


CATEGORY grpvarlist 


Specifies numeric or string grouping variables that define cells. Specify for all 
categorical variables for which LOGIT should generate design variables. 


/ EFFECT produces parameter estimates that are differences from group means. 
DUMMY produces dummy codes for the design variables instead of effect codes. 
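
The difference between the EFFECT and DUMMY options can be illustrated with a short Python sketch. It shows the usual textbook coding of a factor with k levels into k-1 design variables. The choice of the last level as the reference is an assumption made for illustration, and the function name is hypothetical.

```python
def design_codes(levels, coding="effect"):
    """Map each level to its k-1 design-variable codes.

    Assumption: the last level is the reference level. Under effect
    coding it is coded -1 on every design variable; under dummy coding
    it is coded 0 on every design variable.
    """
    k = len(levels)
    codes = {}
    for i, level in enumerate(levels):
        if i < k - 1:
            codes[level] = [1 if j == i else 0 for j in range(k - 1)]
        elif coding == "effect":
            codes[level] = [-1] * (k - 1)
        else:
            codes[level] = [0] * (k - 1)
    return codes


print(design_codes(["low", "medium", "high"], coding="effect"))
print(design_codes(["low", "medium", "high"], coding="dummy"))
```

With effect codes each parameter is a deviation from the average of the group effects; with dummy codes each parameter is a difference from the reference group.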
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SET command 


SET parameter = condvarlist 


Specifies conditional variables. 


Example: 


SET KEY = OPTION(1), OPTION(2), OPTION(3) 


NCAT command 


NCAT n 


Specifies the number of categories in the dependent variable. This is needed only for 
the by-choice data layout where the values of the dependent variable are not explicitly 
coded. 


ALT command 


ALT var 


Specifies the number of alternatives for a choice. It is needed only when the number of 
alternatives in a choice model varies per subject. 
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START command 


START 
Sets up stepwise regression. 


/ BACKWARD    initializes LOGIT for backward stepping.

FORWARD       initializes LOGIT for forward stepping. SYSTAT reports results for
              step 0. If omitted, all variables in MODEL that pass ENTER/REMOVE
              limits are entered/removed from the model.

ENTER=d       probability of F-to-enter limit. The default value is 0.15.

REMOVE=d      probability of F-to-remove limit. The default value is 0.15.

FORCE=n       forces the first n variables in MODEL into the model. The default
              value is 0.

MAXSTEP=n     maximum number of steps to take.


Example: 


START / ENTER = .05, REMOVE = .10 


STEP command 


STEP var 
+ 


Initiates stepwise model building. Using STEP with a variable name enters or removes 
that variable in a single step. 


/ AUTO specifies automated stepwise fitting. 


STOP command 


STOP 


Yields estimates for the final model. 
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SAVE command 


SAVE filename 


Saves the regression diagnostics to a SYSTAT data file. The regression diagnostics are
created if the DC command is preceded by a SAVE command.


Post hoc testing 


DC command 


DC 


Computes deciles of risk and the Hosmer-Lemeshow chi-squared statistic for the last 
model estimated. 


/ BINS=n           allocates approximately equal numbers of observations to each cell.
                   The default value is 10.

P=p1,p2,...,pn     specifies intervals by a list of probability values.


Example:


DC / P = 0.6850, 0.9360, 0.15320, 0.20630 
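
For readers unfamiliar with the statistic, the following Python sketch shows the standard textbook form of the Hosmer-Lemeshow computation over risk-ordered bins. It is not SYSTAT's implementation: the binning rule and the function name are assumptions, and predicted probabilities are assumed to lie strictly between 0 and 1.

```python
def hosmer_lemeshow(outcomes, probabilities, bins=10):
    """Hosmer-Lemeshow chi-squared statistic over risk-ordered bins.

    Cases are sorted by predicted probability and split into `bins`
    groups of approximately equal size (an assumed binning rule).
    """
    pairs = sorted(zip(probabilities, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for b in range(bins):
        cell = pairs[b * n // bins:(b + 1) * n // bins]
        if not cell:
            continue  # more bins than cases
        expected_events = sum(p for p, _ in cell)
        observed_events = sum(y for _, y in cell)
        expected_non = len(cell) - expected_events
        observed_non = len(cell) - observed_events
        statistic += (observed_events - expected_events) ** 2 / expected_events
        statistic += (observed_non - expected_non) ** 2 / expected_non
    return statistic


y = [0, 0, 1, 1]
p = [0.25, 0.25, 0.75, 0.75]
print(hosmer_lemeshow(y, p, bins=2))
```

A small statistic indicates that observed and expected counts agree within each risk group.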


QNTL command 


QNTL var 


Calculates confidence intervals for quantiles based on the last model estimated. It will 
produce not only the LD50 but also other quantiles as well, with upper and lower 
bounds when they exist. 


/ covar1=d1, covar2=d2... specifies fixed values on the covariates over which quantiles are 
produced. 
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Example: 


QNTL AGE/CONSTANT=1, RACE[1]=1 


SIMULATE command 


SIMULATE var1=d1, var2=d2, ...


Generates and saves predicted probabilities and odds ratios, using the last model 
estimated to evaluate a set of logits. The logits are calculated from a combination of 
fixed covariate values listed before the / symbol, and a grid of values following the /. 


/ DO var1=d1,d2,d3,    sets values over which the parameters of the simulation are to
     var2=d1,d2,d3     vary. d1 equals the start value, d2 equals the end value, and
                       d3 equals the increment for each step.


Example: 
SIMULATE CONSTANT=0, AGE=0, LWD=0 / DO LWD*AGE=15,45,5 


Generates statistics for values of the LWD*AGE interaction ranging from 15 to 45 (i.e. 
15, 20, 25, 30, 35, 40, 45) at the specified fixed CONSTANT, AGE, and LWD values. 
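
The grid logic can be sketched in Python as follows. The coefficient values are hypothetical and chosen only for illustration (they are not SYSTAT output), and the helper names are assumptions. The sketch builds the grid 15 to 45 in steps of 5 and evaluates a binary logit at each point.

```python
import math


def simulate_grid(start, end, step):
    """Values from start to end inclusive, spaced by step."""
    count = int(round((end - start) / step))
    return [start + i * step for i in range(count + 1)]


def logistic(logit):
    """Convert a logit to a predicted probability."""
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical coefficients, for illustration only.
coefficients = {"CONSTANT": -1.0, "AGE": 0.04, "LWD": 0.9, "LWD*AGE": -0.02}
# Fixed covariate values listed before the / symbol.
fixed = {"CONSTANT": 0, "AGE": 0, "LWD": 0}

for value in simulate_grid(15, 45, 5):
    logit = sum(coefficients[name] * x for name, x in fixed.items())
    logit += coefficients["LWD*AGE"] * value
    print(value, round(logistic(logit), 4), round(math.exp(logit), 4))
```

Each printed row gives the grid value, the predicted probability, and the odds exp(logit) at that point.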


HYPOTHESIS command 


HYPOTHESIS 


Initiates a block of CONSTRAIN statements for specifying and testing a hypothesis. 


CONSTRAIN command 


CONSTRAIN argument 


Defines the restrictions being tested. These can include any linear algebraic expression 
without parentheses involving the parameters. If interactions were present on the 
MODEL statement, they can also appear on the CONSTRAIN statement. 
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Example: 
CONSTRAIN RACE[1]=0 


TEST command 


TEST 
Initiates testing of the hypothesis. 
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LOGLIN: 
Loglinear Modeling 


The loglinear model is a useful tool for analyzing relationships among the factors of a
multiway frequency table. LOGLIN computes maximum likelihood estimates of the 
parameters of a loglinear model using the Newton-Raphson method. For each 
user-specified model, LOGLIN provides a test-of-fit of the model, observed and 
expected cell frequencies, estimates of the loglinear parameters (lambdas), standard 
errors of the estimates, the ratio of each lambda to its standard error, and multiplicative 
effects (EXP(lambda)).

For each cell, you can request its contribution to the Pearson chi-square or the 
likelihood ratio chi-square. Deviates, standardized deviates, Freeman-Tukey deviates, 
and likelihood ratio deviates are available to characterize departures of the observed 
values from expected values. 
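
As a rough guide to what these quantities measure, the Python sketch below uses common textbook definitions for a cell with observed count O and expected count E. These formulas are assumptions for illustration; the text does not state the exact definitions LOGLIN uses.

```python
import math


def raw_deviate(observed, expected):
    """Observed minus expected frequency."""
    return observed - expected


def standardized_deviate(observed, expected):
    """(O - E) / sqrt(E); its square is the Pearson contribution."""
    return (observed - expected) / math.sqrt(expected)


def freeman_tukey_deviate(observed, expected):
    """sqrt(O) + sqrt(O + 1) - sqrt(4E + 1)."""
    return math.sqrt(observed) + math.sqrt(observed + 1) - math.sqrt(4 * expected + 1)


def likelihood_ratio_deviate(observed, expected):
    """Signed square root of 2 * [O * ln(O / E) - (O - E)]."""
    if observed > 0:
        term = 2 * (observed * math.log(observed / expected) - (observed - expected))
    else:
        term = 2 * expected  # limit of the expression as O goes to 0
    return math.copysign(math.sqrt(max(term, 0.0)), observed - expected)


print(standardized_deviate(12, 9))
print(freeman_tukey_deviate(4, 4))
```

All four are zero or near zero when the model reproduces the cell count, and grow in magnitude as the cell departs from the model.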

When searching for the best model, you can request tests after removing each first- 
order effect or interaction term one at a time individually or hierarchically (when a 
lower-order effect is removed, so are its respective interaction terms). LOGLIN does not 
require that models be hierarchical. 

A model can explain the frequencies well in most cells, but poorly in a few. LOGLIN 
uses Freeman-Tukey deviates to identify the most divergent cell, to fit a model without
it, and, in a stepwise manner, to identify other outlier cells that depart from your model. 

You can specify cells that contain structural zeros (cells that are empty naturally or 
by design, not by sampling). Then LOGLIN can fit a model to the subset of cells that 
remain. A test of fit for such a model is often called a test of quasi-independence. 

For each level of a term included in your model, LOGLIN can save the estimate of 
lambda, the standard error of lambda, the ratio of lambda to its standard error, the 
multiplicative effect, and the marginal indices of the effect. Alternatively, for each cell, 
LOGLIN can save the observed and expected frequencies and its deviates (listed above), 
the contributions to the Pearson and likelihood ratio chi-square statistics, the
contribution to the log-likelihood, and the cell indices.
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Setup: 
     * LOGLIN        or        * LOGLIN
     * USE                     * USE
       FREQ                      FREQ
     * MODEL               HOT * TABULATE
       ZERO
       SAVE
       PLENGTH
 HOT * ESTIMATE


MODEL command


MODEL variables defining table = terms of model


Specifies the frequency table and the terms (marginals) to use to predict the cell
frequencies. Specify the table factors in the same order as they appear from left to right
across the page. You can use this shortcut notation to define your model:

A..D                       includes all variables from A to D (if stored consecutively
                           in the data file); that is, A + B + C + D.
A#B                        includes lower-order effects with the interaction term;
                           that is, A + B + A*B.
A..D^i or (A+B+C+D)^i      includes all ith and lower-order terms.
-(...)                     removes terms (...) from the preceding specification.


Example:


MODEL AGE*SEX*EDU = AGE + EDUC + AGE*EDU
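
To make the shortcut notation concrete, the Python sketch below expands a list of factors into all terms up to a given order, which is what the A#B and ^i shortcuts stand for. It is an illustration of the expansion rule only, not part of the LOGLIN command language, and the function name is hypothetical.

```python
from itertools import combinations


def expand_terms(factors, order):
    """List all main effects and interactions up to the given order."""
    terms = []
    for size in range(1, order + 1):
        for combo in combinations(factors, size):
            terms.append("*".join(combo))
    return terms


# A#B corresponds to all terms up to order 2 for the factors A and B.
print(" + ".join(expand_terms(["A", "B"], 2)))
# (A+B+C+D)^2 corresponds to all main effects and two-way interactions.
print(" + ".join(expand_terms(["A", "B", "C", "D"], 2)))
```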


ESTIMATE command 


ESTIMATE 


Initiates the estimation process, defines the computational controls, and lists the 
result(s) to print: 
/ DELTA=n    value added to observed frequency in each cell. The default is 0.0.
LCONV=n      log-likelihood convergence criterion. The default is 0.000001.
CONV=n       parameter convergence criterion. The default is 0.0001.
TOL=n        tolerance limit. The default is 0.001.
ITER=n       maximum number of iterations. The default is 10.
HALF=n       maximum number of step halvings. The default is 10.


ZERO command


ZERO CELL n1, n2, ... 
EMPTY 


Specifies one or more cells for treatment as structural zeros. List the index (n1, n2, ...)
of each factor in the order in which the factors appear in the table. Repeat CELL for each
cell. Each ZERO statement clears all cells previously identified as structural zeros.
EMPTY causes all empty cells to be defined as structural zeros.


Example: 


ZERO CELL = 1122 CELL 71321 


/ BLANK    Each cell that is defined as a structural zero is displayed as blank. This
           option works only with the EMPTY option.


Example:


ZERO EMPTY / BLANK
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SAVE command 


SAVE filename 
Specifies the filename and what to save in the file. 


/ ESTIMATES    for each cell in the table, saves the observed and expected frequencies
               and their differences, standardized and Freeman-Tukey deviates, the
               contribution to the Pearson and likelihood ratio chi-square statistics,
               the contribution to the log-likelihood, and the cell indices.

LAMBDAS        for each level of each term in the model, saves the estimate of lambda,
               the standard error of lambda, the ratio of lambda to its standard error,
               the multiplicative effect (EXP(lambda)), and the indices of the table
               of factors.


PLENGTH command 


PLENGTH NONE 
SHORT 
MEDIUM 
LONG 


Controls the amount of output reported and requests specific output panels to print. 
Optionally, include one or more panels not provided with your argument specification. 


/ CELLS=n    number of outlandish cells to identify. The defaults for SHORT, MEDIUM,
             and LONG are 3, 5, and 10, respectively.


The argument SHORT requests the following four features: 


OBSFREQ observed frequencies 

CHISQ Pearson and likelihood ratio chi-square statistics 
RATIO lambda divided by standard error of lambda 
MLE log of the model’s maximized likelihood value 
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MEDIUM requests the output for SHORT above plus the following five features: 


EXPECT     expected frequency for each cell (current model)
STAND      standardized deviates
ELAMBDA    multiplicative effects (EXP(lambda))
TERM       tests each term by removing it from the model
HTERM      tests each term by removing it and its higher order interactions from
           the model


LONG requests the output for SHORT and MEDIUM above plus the following features: 


PARAM       coefficients of design variables
COVA        covariance matrix of the parameters
CORR        correlation matrix of the parameters
LAMBDA      additive effect of each level for each term
SELAMBDA    standard errors of lambdas
DEVIATES    observed-expected frequency for each cell
LRDEV       likelihood ratio deviate for each cell
FTDEV       Freeman-Tukey deviate for each cell
PEARSON     contribution to Pearson chi-square from each cell
LOGLIKE     contribution to model's log-likelihood from each cell


TABULATE command


TABULATE var1 * var2 * ...


Prints multiway table. Use only when no analysis is desired. Specify table factors in 
the same order as they appear from left to right across the page. 


Example: 


TAB CENTERS * AGE * TUMORS 
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MANOVA: 
Multivariate Analysis of Variance 


Use MANOVA to perform multivariate analysis of variance (MANOVA) or multivariate
analysis of covariance (MANCOVA) for a wide variety of balanced and unbalanced
designs, including repeated measures, split plot, and nested designs. It offers
both univariate and multivariate approaches to repeated measures designs.
The means model for missing cells designs is available with the MANOVA model.

Once you have estimated your MANOVA model, it is easy to compare means or
linear combinations of means, or to test any contrast across cell means, including simple
effects, by using hypothesis tests and between- and within-group testing. After the
parameters of a model have been estimated, they can be tested by any general linear
hypothesis of the following form:


ABC' = D


where A is a matrix of linear weights on coefficients across the independent variables 
(the rows of B), C is a matrix of linear weights on the coefficients across dependent 
variables (the columns of B), B is the matrix of regression coefficients or effects, and 
D is a null hypothesis matrix (usually a null matrix). 
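
A small numerical sketch may help fix the roles of A, B, and C. The Python below uses plain nested lists and made-up coefficient values (hypothetical, for illustration only) to evaluate ABC' for one contrast across predictors and one contrast across dependent variables.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*x)]


# B: rows are predictors (constant plus two effects), columns are two
# dependent variables. Values are hypothetical.
B = [[10, 12],
     [2, 3],
     [2, 5]]
A = [[0, 1, -1]]   # contrast between the second and third rows of B
C = [[1, -1]]      # difference between the two dependent variables

print(matmul(matmul(A, B), transpose(C)))
```

The hypothesis test then asks whether this quantity differs from the corresponding entry of D (usually zero) by more than sampling error would explain.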
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Setup: 




* MANOVA 


* 


* 


HOT * 


HOT * 


MANOVA features and options by type of analysis 


USE 

MODEL 
CATEGORY 
PLENGTH 

SAVE 

ESTIMATE 
HYPOTHESIS 
EFFECT 

WITHIN 

SPECIFY (contrast language) 
CONTRAST [matrix] 
AMATRIX [matrix] 
CMATRIX [matrix] 
DMATRIX [matrix] 
ERROR 

POST 

PAIRWISE 

TEST 


For multivariate multiple regression


MANOVA
MODEL / N=n
ESTIMATE


For MANOVA and MANCOVA


MANOVA
MODEL / REPEAT NAMES MEANS WEIGHT
CATEGORY / EFFECT DUMMY MISS
ESTIMATE / SS=TYPE1 or TYPE2 or TYPE3 or QUICK or NOQUICK
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Under HYPOTHESIS: 


HYPOTHESIS 
WITHIN 


CONTRAST / ADJDIFF POLYNOMIAL SUM ORDER METRIC HELMERT 
RHELMERT DEVIATION[c] SIMPLE[c] 


SPECIFY 
AMATRIX 
CMATRIX 
DMATRIX 
ERROR 
POST 
PAIRWISE 
TEST 


Model Building 


MODEL command 


MODEL varlist1 = CONSTANT + varlist2 + var1*var2 + var3(var4)


Specifies the linear model to estimate. Varlist1 specifies the dependent variables. 
Varlist2 specifies main effects (independent variables). CONSTANT is an optional 
parameter (when in doubt, include the CONSTANT). Specify interactions by linking 
variables with an asterisk. The parentheses denote nested factors. Use CATEGORY to 
specify all design factors. Model variables not listed with CATEGORY are covariates.
/ REPEAT=m,n,...                     number of levels for each repeated measures factor.
                                     For example, if you have DAY with three levels and
                                     AM_PM with two levels, type REPEAT=3,2 for data
                                     ordered as D1_AM, D1_PM, D2_AM, D2_PM,
                                     D3_AM, D3_PM.

REPEAT=m(x1,x2,...) n(y1,y2,...)     m and n are as defined above. Use xi and yi to
                                     specify the scale for orthogonal polynomials.
                                     Specify one number for each level of the repeated
                                     measures.

NAMES='name1','name2',...            a name for each repeated measures factor.

MEANS                                specifies a fully factorial design using means
                                     coding.

WEIGHT                               weights cell means by the cell counts before
                                     averaging.

N=n                                  when your data file is a symmetric matrix (for
                                     example, a covariance matrix), specify the sample
                                     size n that generated the matrix.


Note: To display results for single degree of freedom polynomial contrasts for each 
repeated measure factor, insert the command PLENGTH MEDIUM before ESTIMATE. 


Examples: 


MODEL Y1 Y2..YP = CONSTANT + A + B + C + A*B + B*C


MODEL LEARNING SCORE = CONSTANT + SCHOOL + TEACHER(SCHOOL) 


CATEGORY command 


CATEGORY grpvarlist 


Specifies numeric or string grouping variables that define cells. Specify for all 
categorical variables for which GLM should generate design variables. 


/ MISS allows cases with a missing value for the categorical variable to be included in 
the analysis. 
EFFECT produces parameter estimates which are differences from group means. 
DUMMY produces dummy codes for the design variables instead of effect codes. 
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Example: 


CAT SEX DOSE$/ MISS 


SAVE command 


SAVE filename 


Before ESTIMATE: Saves specified statistics to a file. 


/ COEF      saves estimates of the regression coefficients.
MODEL       specifies variables from your MODEL statement in addition to the statistics
            given by RESID.
RESID       saves the estimate, residual, and leverage.
DATA        saves data from original file.
ADJUSTED    saves adjusted cell means from multivariate analysis of covariance
            (MANCOVA).


PLENGTH command


PLENGTH SHORT
MEDIUM
LONG


SHORT     For estimation, SHORT prints dependent variable means, estimated parameters,
          and information criterion.
MEDIUM    prints least-squares cell means and standard deviations with analysis of
          variance.
LONG      LONG adds the total sum of product matrix, residual (or pooled within groups)
          sum of product matrix, residual (or pooled within groups) covariance matrix,
          and the residual (or pooled within groups) correlation matrix. For hypothesis
          testing, LONG adds A, C, and D matrices, the matrix of contrasts, and the
          inverse of the cross-products of contrasts, hypothesis and error sum of product
          matrices, tests of residual roots, canonical correlations, coefficients, and
          loadings.
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ESTIMATE command 


ESTIMATE 


Tells SYSTAT to estimate the analysis specified in MODEL. 


/ SS=TYPE1    uses type I sum of squares for analysis.
     TYPE2    uses type II sum of squares for analysis.
     TYPE3    uses type III sum of squares for analysis.
QUICK         displays quick graphs even though global graph option is off.
NOQUICK       suppresses quick graphs even though global option is on.


Hypothesis testing 
Toggling between the command line and the GUI is supported in ANOVA, GLM, MANOVA,
MIXED, LOGIT, and RSM. That is, if estimation is performed through a dialog box, then
post-estimation analysis can be performed through commands, and vice versa.


HYPOTHESIS command 


HYPOTHESIS 


Tells SYSTAT that you want to test hypotheses on a previous MODEL. You must enter 
HYPOTHESIS before you can use any of the following commands. 


EFFECT command 


EFFECT var or var1*var2 or var1 & var2, ... 


Specifies the effect to include in the hypothesis test. Var specifies main effects.
Interactions are specified by joining two variables with an asterisk (*).
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WITHIN command 


WITHIN ‘reptedfacname’ 


Specifies hypotheses across a single repeated measures factor. The name specified with 
WITHIN is from the NAMES option of the MODEL statement. Use CONTRAST to construct
specific contrasts.


CONTRAST command 


CONTRAST [matrix] 


Generates a contrast on a repeated measures factor or grouping variable. Identify the 
repeated measures factor with WITHIN or grouping variables with EFFECT. 


/ ADJDIFF          compares each level of the factor with its adjacent level.

DEVIATION[c]       compares the mean of each level of the selected categorical variable
                   (excluding reference level c) to the mean of all the levels (grand
                   mean).

SIMPLE[c]          compares the mean of each level to the mean of a specified
                   reference level c.

HELMERT            compares the mean of each level of the selected factor to the mean
                   of subsequent levels.

RHELMERT           compares the mean of each level (except the first) to the mean of
                   previous levels.

POLYNOMIAL         generates orthogonal polynomial comparisons. The default is equally
                   spaced polynomials.

ORDER=n            a polynomial of order n. The default is all orthogonal polynomials.

METRIC=m,n,...     generates unequally spaced orthogonal polynomials. Specify one
                   number per level.

SUM                computes and tests the sum of the levels.


Examples:


WITHIN 'TIME'


CONTRAST / POLY, METRIC = 2,4,8,16


CONTRAST [-3 -1 1 3]
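
The contrast types described above can be written out as coefficient rows. The Python sketch below builds adjacent-difference, Helmert, and reverse Helmert rows for a factor with k levels, following the verbal definitions given for ADJDIFF, HELMERT, and RHELMERT. The sign and scaling conventions are assumptions (SYSTAT may scale its contrasts differently), and the function names are hypothetical.

```python
def adjacent_difference(k):
    """Row i compares level i with level i + 1."""
    return [[1 if j == i else -1 if j == i + 1 else 0 for j in range(k)]
            for i in range(k - 1)]


def helmert(k):
    """Row i compares level i with the mean of all subsequent levels."""
    rows = []
    for i in range(k - 1):
        later = k - i - 1
        rows.append([0.0] * i + [1.0] + [-1.0 / later] * later)
    return rows


def reverse_helmert(k):
    """Row for level i (i >= 1) compares it with the mean of previous levels."""
    rows = []
    for i in range(1, k):
        rows.append([-1.0 / i] * i + [1.0] + [0.0] * (k - i - 1))
    return rows


for row in helmert(4):
    print(row)
```

Every row sums to zero, as does the explicit contrast [-3 -1 1 3] in the example, which is the linear trend for four equally spaced levels.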
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SPECIFY command 


SPECIFY hypothesis language / options 


Automatically codes A, C, and D matrices. The syntax allows for coefficients, 
addition, subtraction, multiplication of coefficients, equality, and constants: 


-3*DRUG[1] - 1*DRUG[2] + 1*DRUG[3] + 3*DRUG[4] = 0 


With the effects model, effects are contrasted. With the means model, means are 
contrasted. 


Example: 


SPECIFY DRUG[1] DISEASE[2] = DRUG[3] DISEASE[2]


AMATRIX command 


AMATRIX [matrix] 


Specifies a design matrix for your hypothesis (for between-subjects effects). 


Example: 


AMATRIX [0 1 0 -1 0 0 1]


CMATRIX command 


CMATRIX [matrix] 
Specifies a design matrix for the dependent variables. 
Example: 


CMATRIX [-1 -1 -1 1 1 1] 
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DMATRIX command 


DMATRIX [matrix] 


Identifies the value of the null hypothesis. The default value is 0 for each effect. 


Example: 


DMATRIX [10 15] 


ERROR command 


ERROR [matrix] 
Specifies error sum of squares and cross-products matrix. 
Example: 


ERROR [23.54 2.35; 2.35 45.62] 


POST command 


POST grpvar 


Performs all pairs comparison of mean vectors for the selected grouping variable. 
Example: 


POST GENDER$ 


PAIRWISE command 


PAIRWISE factorname / BONFERRONI or SIDAK 


Performs post hoc tests for repeated measures.
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TEST command 


TEST 


Initiates the test of your hypothesis. 
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MATRIX 




Most of the classical analyses provided by statistical software packages can be
expressed using matrix algebra, and thus computed in SYSTAT's MATRIX procedure.
Students will find MATRIX useful for gaining an understanding of matrix algebra and
the inner workings of statistical computations. Researchers can prototype
state-of-the-art procedures before they appear in statistics packages, and also execute
complex data management tasks.


Setup: 


HOT 
HOT 
HOT 
HOT 
HOT 
HOT 
HOT 


HOT 
HOT 
HOT 
HOT 
HOT 
HOT 
HOT 
HOT 
HOT 


MATRIX 


USE 

MAT name = matrix expression 
MAT name = mat_name (ref to rows, ref to columns) 
MAT result = function(argument) 
ROWNAME 

COLNAME 

SHOW 

FORMAT 

CLEAR 

CLEAR MATRIX 

MDELETE COLUMN 

MDELETE ROW 

MSELECT 

DIRECTORY 

LET result = function(varname) 
CALL EIGEN, CHOL, QRD, SVD 
MSAVE 
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Entering and defining matrices 


MAT command (matrix creation) 


MAT name=matrix expression 


Creates a matrix named name from matrix expression where the latter can be: 


■ an expression:

MAT CPD = TRP(X-COLMEAN(X)) * (X-COLMEAN(X))

■ values you type; for example, for the (3 x 2) matrix A, type:
MAT A = [2 4; 3 1; 1 5]


■ a submatrix (see next entry)
■ the name of a current matrix
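
The expression TRP(X-COLMEAN(X)) * (X-COLMEAN(X)) computes the matrix of cross-products of deviations from the column means. The Python sketch below reproduces that calculation with nested lists, using the typed-in matrix from the example above as data. The helper names are hypothetical, and it is assumed that COLMEAN returns one mean per column that is subtracted from every row.

```python
def column_means(x):
    """Mean of each column of a matrix stored as a list of rows."""
    n = len(x)
    return [sum(col) / n for col in zip(*x)]


def centered_cross_products(x):
    """Cross-products of deviations from the column means."""
    means = column_means(x)
    dev = [[v - m for v, m in zip(row, means)] for row in x]
    p = len(means)
    return [[sum(r[i] * r[j] for r in dev) for j in range(p)] for i in range(p)]


X = [[2, 4],
     [3, 1],
     [1, 5]]
for row in centered_cross_products(X):
    print(row)
```

Dividing the result by (number of rows - 1) would give the sample covariance matrix.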


MAT command (submatrices) 


MAT name=mat_name(ref to rows; ref to columns) 


Extracts submatrix name from matrix mat_name. 


ref to rows is:

■ a list of row numbers
■ a list of row names
■ a condition involving column names (for example, AGE>21)

ref to columns is:

■ a list of column numbers
■ a list of variable names or column names
■ a condition involving row names
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USE command 


USE name1


Reads the matrix name1 from the file name1.SYZ. 


/ MATRIX=name2             reads the matrix from the file name1.SYZ and
                           names it name2.
MTYPE=NUMERIC or STRING    reads all numeric or string variable(s) as a matrix.
                           By default SYSTAT reads only numeric columns.
ROWNAME=var or var$        uses var (or var$) to name the rows of the matrix.
COLNAME=var or var$        uses var (or var$) to name the columns of the matrix.


ROWNAME command


ROWNAME matrix=name1, name2,... 


Names the rows of the matrix named matrix as name1, name2, etc.
Example: 


ROWNAME GRADES = CINDY JEFF MATT 


COLNAME command 


COLNAME matrix=name1, name2, ... 
Names the columns of the matrix named matrix as name1, name2, etc.
Example: 


COLNAME GRADES = QUIZ1 QUIZ2 MIDTERM 




SHOW command 


SHOW matrix1, matrix2, ... 


Prints the specified matrices to the current output device. The default is the terminal 
screen. 


FORMAT command 


FORMAT m, n 


Formats each field in a matrix: m is the number of character spaces in each field; n is
the number of digits following the decimal point (0 <= n <= 9). The default is 12 for m,
3 for n. You can specify n alone.


/ UNDERFLOW prints in exponential notation tiny numbers that otherwise would appear as 
0. 


MSAVE command 


MSAVE matrix 


MSAVE name1                saves the matrix name1 in a file 'name1.SYZ'.
MSAVE name2 / MAT=name1    saves the matrix name1 in a file 'name2.SYZ'.


CLEAR command 


CLEAR name1, name2, ...


Removes the specified matrices from memory. 
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MATRIX 


CLEAR MATRIX command 


CLEAR MATRIX = name1, name2, ...


Removes the specified matrices from memory. 


MDELETE ROW command 


MDELETE ROW = list


Deletes rows of the active matrix specified in list. list can contain ROWNAMES, row
indices, or a range of rows (for example, 10 .. 31).


MDELETE COLUMN command 


MDELETE COLUMN = list


Deletes columns of the active matrix specified in list. list can contain variable names
(COLNAMES), column indices, or a range of columns (for example, age .. income).


MSELECT command 


MSELECT condition 
Uses only those rows of the active matrix that meet the specified condition. 
Example: 


MSELECT HEIGHT > SQR(WEIGHT)
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DIRECTORY command 


DIRECTORY 
DIRECTORY matrix, matrix2, ... 


Prints information about all matrices known to MATRIX, or about specified matrices. 


MLET command 


MLET col = expression 


Assigns the value of expression to the variable/column col. You can use either numeric 
or string variables. String values must be enclosed in apostrophes or quotation marks. 


Example: 


MLET SQRT_INC = SQR(INCOME) 


Functions 


MAT command (functions) 


MAT result = function(argument) 


Creates a matrix named result from the specified function. 


CALL command 


CALL name(name1 name2 ... namen) 


The CALL command is used to request computations that yield more than one matrix. 
The command uses the routine name. The output from applying name to namen is
saved as name1, name2, etc.
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MATRIX 
Examples:

CALL EIGEN (EIGEN_VALUES, EIGEN_VECTORS, INPUT_MATRIX | EXPRESSION)
    computes the eigenvalues and eigenvectors of a matrix.
CALL QRD (Q_MATRIX, R_MATRIX, INPUT_MATRIX | EXPRESSION)
    computes the QR decomposition of a matrix.
CALL CHOL (L_MATRIX, D_MATRIX, INPUT_MATRIX | EXPRESSION)
    computes the Cholesky decomposition of a matrix.
CALL SVD (U_MATRIX, D_MATRIX, V_MATRIX, INPUT_MATRIX | EXPRESSION)
    computes the singular value decomposition of a matrix.


Generating matrices and forming submatrices 


MAT result = function (argument) 


Function (argument) Result 

I(n)                an identity matrix of order n.

M(n p num)          an (n x p) matrix filled with num. If num is omitted, the matrix
                    is filled with zeros. num can be an expression.

FILL(matrix num) values of matrix that are missing are filled with num. num can 
be an expression. 

DIAG(matrix) the diagonal elements of matrix as a row vector. 

DIAG3(matrix) the diagonal and the elements above and below it as a rectangular 
matrix with 3 rows and as many columns as elements on the 
diagonal. 

GAID(vector) the diagonal matrix from the specified vector. 


COMPLETE(matrix) rows with missing data are removed from matrix. 
Manipulating matrices 


MAT result = function (argument) 


Function (argument)       Result 

TRP(matrix)               transpose of matrix. 

ROWSORT(matrix,colnum)    rows of matrix are ordered according to the values in column 
                          number colnum. 

COLSORT(matrix,rownum)    columns of matrix are ordered according to the values in row 
                          number rownum. 

FOLD(sqrmat)              the lower triangular portion of sqrmat is copied to the upper 
                          triangular portion. 

SHAPE(matrix,m,n)         an m x n matrix containing the elements of matrix; that is, 
                          reshapes the row and column structure. All columns of the 
                          matrix must be the same type (that is, all numeric or all 
                          character). 

DIM(matrix)               the number of rows and columns of matrix. 

NROW(matrix)              the number of rows in matrix. 

NCOL(matrix)              the number of columns in matrix. 

STRING(matrix) or         column vector containing the elements in the lower-triangular 
STRING(matrix,1)          portion of matrix. 

STRING(matrix,0)          column vector containing the elements below the diagonal of 
                          matrix. 

GNIRTS(vector) or         matrix with the elements of vector as the lower-triangular 
GNIRTS(vector,1)          portion and missing values above the diagonal. 

GNIRTS(vector,0)          matrix with the elements of vector below the diagonal and 
                          missing values on and above the diagonal. 

name1 || name2 || name3...    matrices name1, name2, etc. are concatenated side-by-side. 
name1 // name2 // name3...    matrices name1, name2, etc. are concatenated end-to-end. 
name1 /| name2 /| name3...    matrices name1, name2, etc. are concatenated 
                              corner-to-corner. 


Matrix algebra 


Matrix manipulation: 

*                         matrix multiplication (A*B) 

** or ^                   raise matrix to a power (A**2) 

KRON(matrix1 matrix2)     Kronecker product of two matrices. If matrix1 is (n x p) and 
                          matrix2 is (p x s), the result has (n x p) rows and (s x p) 
                          columns; each (p x s) submatrix is the product of its element 
                          in matrix1 with every element in matrix2. 
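
The block structure of the Kronecker product described above can be illustrated outside SYSTAT. The following is a minimal Python sketch, not SYSTAT code; the function name kron and the sample matrices are hypothetical. In general, for an (n x p) first matrix and an (r x s) second matrix, the result has n*r rows and p*s columns. 

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    result = []
    for row_a in a:
        # Each row of `a` expands into rows_b rows of the result.
        for i in range(rows_b):
            # Every element of row_a multiplies the whole i-th row of b.
            result.append([x * b[i][j] for x in row_a for j in range(cols_b)])
    return result


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
K = kron(A, B)  # 4 x 4 result: one 2 x 2 block per element of A
for row in K:
    print(row)
```

Each 2 x 2 block of K is B scaled by the corresponding element of A. 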
The inverse, determinant, and trace 


MAT result = function(argument) 


Function (argument)     Result 

INV(matrix)             the inverse of matrix 

SWEEP(matrix,vector)    pivot on some diagonal elements of matrix defined by vector 
                        (a "1" means pivot, a "0" means no pivot) 

DET(matrix)             the determinant of matrix 

LOGDET(matrix)          the log of the determinant of matrix 

TRACE(matrix)           the trace (sum of the elements on the diagonal of matrix) 

Eigenvalues 


MAT result = function(argument) 


Function (argument) Result 

EIGVAL (matrix) the eigenvalues of matrix 
Using CALL: 

CALL function (arguments) 

Function (arguments) Result 


EIGEN(vals,vects,matrix) the eigenvalues (vals) and eigenvectors (vects) of matrix 

Solving a system of linear equations 


MAT matrix = function(argument) 


Function (argument)        Result 

SOLVE(coefmat,constmat)    x of the equation system constmat = x * coefmat 


QR, SVD, and Cholesky decomposition 
MAT result = function(argument) 
Function (argument) Result 


CHOL(matrix) the matrix L, found by decomposing the symmetric matrix into 
L*L', where L has zeros above the diagonal. 

Using CALL: 

CALL function(arguments) 


Function (arguments)                      Result 

CHOL(d_matrix,l_matrix,matrix)            the matrices D and L*, found by decomposing matrix into 
                                          L* D L*', where L* has zeros above the diagonal and D 
                                          is 0 everywhere but the diagonal. 

QRD(q_matrix,r_matrix,matrix)             the matrices Q and R, found by decomposing matrix into 
                                          QR, where Q'*Q=I, and R is 0 below the diagonal. 

SVD(u_matrix,d_matrix,v_matrix,matrix)    the matrices U, D, and V, found by decomposing matrix 
                                          into U*D*V, where D is 0 except for its diagonal, and 
                                          U*U'=I and V*V'=I. 
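
To make the two Cholesky forms concrete, here is a minimal Python sketch (not SYSTAT code) that computes L with matrix = L*L' and then derives a D and L* with matrix = L* D L*'. The function names and the sample matrix are hypothetical, and treating L* as unit lower-triangular is an assumption; the manual does not state how SYSTAT scales L*. 

```python
import math


def cholesky(a):
    """Lower-triangular L with a = L * L' (a symmetric, positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def ldl_from_cholesky(L):
    """Assumed convention: L* has ones on the diagonal, D holds the pivots."""
    n = len(L)
    d = [L[i][i] ** 2 for i in range(n)]
    l_star = [[L[i][j] / L[j][j] if j <= i else 0.0 for j in range(n)]
              for i in range(n)]
    return d, l_star


A = [[4.0, 2.0],
     [2.0, 3.0]]
L = cholesky(A)
D, L_STAR = ldl_from_cholesky(L)
print(L)
print(D, L_STAR)
```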


Transformations 

Arithmetic operators 

+    addition        #    multiplication    ##    power (A##3 = A#A#A) 
-    subtraction     /    division          -     change of sign 


Relational and logical operators (result is a matrix of 1's (true) or 0's (false)) 


=     equal                     <>     not equal 

<     less than                 AND    Boolean and 

<=    less than or equal        OR     Boolean or 

>     greater than              NOT    Boolean negation 

>=    greater than or equal 

MAT result = function(argument) 


Function (argument)    Result 

SQR (matrix)           square root 
LOG (matrix)           log base e 
L10 (matrix)           log base 10 
EXP (matrix)           exponentiation 
LGM (matrix)           log gamma 
ABS (matrix)           absolute value 
INT (matrix)           integer part 
SIN (matrix)           sine 
COS (matrix)           cosine 
TAN (matrix)           tangent 
TNH (matrix)           hyperbolic tangent 
ASN (matrix)           arc sine 
ACS (matrix)           arc cosine 
ATN (matrix)           arc tangent 
ATH (matrix)           arc tangent hyperbolic 
AT2 (matrix            arc tangent 


Statistical functions 


MAT result = function(argument) 


Function (argument)                    Result 

ROWMEAN(matrix)    COLMEAN(matrix)     mean of values in each row or column 

ROWSTD(matrix)     COLSTD(matrix)      standard deviation of the values in each row or 
                                       column 

ROWSUM(matrix)     COLSUM(matrix)      sum of values in each row or column 

ROWMIN(matrix)     COLMIN(matrix)      minimum value in each row or column 

ROWMAX(matrix)     COLMAX(matrix)      maximum value in each row or column 

ROWMIS(matrix)     COLMIS(matrix)      number of missing values in each row or column 

ROWNUM(matrix)     COLNUM(matrix)      number of nonmissing values in each row or 
                                       column 

ROWZSC(matrix)     COLZSC(matrix)      z scores for each element relative to its row or 
                                       column. For each element, x, the row z score is 
                                       computed by subtracting the row mean and dividing 
                                       this difference by the row standard deviation. 
                                       The column z scores are computed similarly. 

ROWRANK(matrix)    COLRANK(matrix)     ranks of the values in each row or column 

CORR(matrix)                           correlation matrix for columns of matrix 
COVA(matrix)                           covariance matrix for columns of matrix 
SSCP(matrix)                           cross-products of deviations for columns of matrix 
EQUAL(matrix1,matrix2)                 1 if matrix1 = matrix2, 0 if matrix1 <> matrix2 
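
The row and column z score computation described for ROWZSC and COLZSC can be sketched in Python as follows. This is an illustration, not SYSTAT code: the function names are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the manual does not say which denominator is used. 

```python
from statistics import mean, stdev


def row_zscores(matrix):
    """Subtract each row's mean and divide by that row's standard deviation."""
    result = []
    for row in matrix:
        m, s = mean(row), stdev(row)  # assumption: sample SD (n - 1)
        result.append([(x - m) / s for x in row])
    return result


def col_zscores(matrix):
    """Same computation applied column by column, via a transpose."""
    transposed = [list(col) for col in zip(*matrix)]
    z = row_zscores(transposed)
    return [list(row) for row in zip(*z)]


X = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 9.0]]
print(row_zscores(X))
print(col_zscores(X))
```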
Design variables 


Function (argument)         Result 

DESIGNO (vector)            "0, 1" design variables 

DESIGN (vector)             "1, 0, -1" design variables 

DESIGNF (vector)            full-rank design variables 

ORTHEQ (vector)             equally spaced orthogonal components 

ORTHUN (vector,spacing)     unequally spaced orthogonal components 
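
For readers unfamiliar with the two coding schemes named above, the following Python sketch shows one common convention for "0, 1" (dummy) and "1, 0, -1" (effect) design variables. It is not SYSTAT code: the function names are hypothetical, and both the choice of the last sorted level as the reference level and the use of k - 1 columns for k levels are assumptions that may differ from what the SYSTAT functions return. 

```python
def dummy_codes(values):
    """One 0/1 column per level except the (assumed) reference level."""
    levels = sorted(set(values))
    return [[1 if v == lev else 0 for lev in levels[:-1]] for v in values]


def effect_codes(values):
    """Like dummy codes, but the reference level is coded -1 in every column."""
    levels = sorted(set(values))
    reference = levels[-1]  # assumption: last sorted level is the reference
    codes = []
    for v in values:
        if v == reference:
            codes.append([-1] * (len(levels) - 1))
        else:
            codes.append([1 if v == lev else 0 for lev in levels[:-1]])
    return codes


groups = ["a", "b", "c", "a"]
print(dummy_codes(groups))
print(effect_codes(groups))
```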
MDS 


Multidimensional Scaling 


MDS offers nonmetric multidimensional scaling of a similarity or dissimilarity matrix 
in 1 to 5 dimensions. Multidimensional scaling is a powerful data reduction procedure 
that can be used on a direct similarity or dissimilarity matrix or on one derived from 
rectangular data with CORR. SYSTAT provides three MDS loss functions (Kruskal, 
Guttman, and Young) that produce results comparable to those from three of the major 
MDS packages (KYST, SSA, and ALSCAL). All three methods perform a similar 
function: to compute coordinates for a set of points in a space such that the distances 
between pairs of these points fit as closely as possible to measured dissimilarities 
between a corresponding set of objects. 

The family of procedures called principal components or factor analysis is related 
to multidimensional scaling in function, but multidimensional scaling differs from this 
family in important respects. Usually, but not necessarily, multidimensional scaling 
can fit an appropriate model in fewer dimensions than these other procedures. 
Furthermore, if it is implausible to assume a linear relationship between distances and 
dissimilarities, multidimensional scaling nevertheless provides a simple dimensional 
model. For more information, see Borg and Lingoes (1981, 1987), Carroll and Arabie 
(1980), Davison (1983), Green and Rao (1972), Kruskal and Wish (1978), Schiffman, 
Reynolds, and Young (1981), and Shepard, Romney, and Nerlove (1972). 

MDS also computes the INDSCAL individual differences multidimensional scaling 
model (Carroll and Chang, 1970). The INDSCAL model fits dissimilarity/similarity 
matrices for multiple subjects into one common space, with jointly estimated weight 
parameters for each subject (that is, a dissimilarity matrix is input for each subject and 
separate (monotonic) regression functions are computed). MDS can fit the INDSCAL 
model using any of the three loss functions, although we recommend using Kruskal’s 
STRESS for this purpose. 

Finally, MDS can fit the nonmetric unfolding model (Coombs, 1964). This allows 
you to analyze rank-order preference data. 




Setup: 


* MDS 
* USE 
* MODEL 
CONFIG 
SAVE 
HOT * ESTIMATE 


MODEL command 


MODEL varlist 

Specifies variables to scale. If you omit varlist, all variables are used. 

/ ROWS=n          number of rows for a RECT matrix. The default is the number of 
                  cases in the working data file. 
  SHAPE=SQUARE    specifies the type of matrix input. The default is SQUARE. 


CONFIG command 


CONFIG LAST 

or 

CONFIG [matrix] 

Specifies a starting configuration for the scaling. There must be as many rows as items 
and columns as dimensions. The LAST argument allows you to reuse the configuration 
from the previous scaling. 

SAVE command 


SAVE filename 


Saves the matrix of interpoint distances or the final configuration. You must enter 
SAVE before ESTIMATE. 

/ CONFIG    the final configuration. 
  DIST      matrix of distances between points in the final scaled configuration. 
  RESID     data (DATA), distances (DIST), estimated distances (DHAT), residuals 
            (RESIDUAL), and the ROW and COLUMN number of the original 
            distances in a rectangular SYSTAT file (each entry from the original 
            matrix is now a case). 


ESTIMATE command 


ESTIMATE 


Initiates the computations. 


/ LOSS=GUTTMAN      GUTTMAN is the Guttman coefficient of alienation scaling method. 
       KRUSKAL      KRUSKAL, the default, specifies Kruskal's STRESS formula 1 scaling 
       YOUNG        method. YOUNG specifies Young's SSTRESS scaling method, which 
                    allows you to scale using the loss function featured in ALSCAL. 

  DIM=n             number of dimensions in which to scale. n must be a positive integer 
                    less than or equal to both the number of variables that you scale 
                    and 5. The default value of dimension is 2. 

  R=n               constant for the Minkowski (power) metric for computing distances. A 
                    constant of 1 is the city-block metric. The default value of R is 2 
                    (the Euclidean distance). 

  ITER=n            maximum number of iterations allowed before termination. 
                    The default number of iterations is 50. 

  REGRESS=MONO      specifies the form of the function relating distances to similarities 
          LINEAR    (or dissimilarities). Available only with KRUSKAL or YOUNG. MONO, the 
                    default, specifies nonmetric scaling; LINEAR specifies metric scaling. 
          LOG       fit E[y] = a + b ln(x) 
          POWER     fit E[y] = x^p 

  WEIGHT            adds weights for each dimension and each matrix (subject) into the 
                    calculation of separate distances which are used in the minimization. 

  SPLIT=ROW         splits calculation of the loss function by rows of the matrix or by 
        MATRIX      matrices. 

  CONVERGE=n        sets the convergence criterion. This is the largest relative change 
                    in any coordinate before iterations terminate. The default value of 
                    the convergence criterion is 0.005. 
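
The role of the R constant can be illustrated with a short Python sketch of the Minkowski distance, together with the textbook form of Kruskal's STRESS formula 1. This is not SYSTAT code: the function names are hypothetical, and the stress function shows the standard published formula, which is assumed rather than confirmed to match SYSTAT's internal computation. 

```python
def minkowski(p, q, r=2.0):
    """Minkowski distance: r=1 is city-block, r=2 is Euclidean."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1.0 / r)


def stress1(distances, fitted):
    """Textbook Kruskal STRESS formula 1 for distances d and fitted values dhat:
    sqrt( sum (d - dhat)^2 / sum d^2 )."""
    num = sum((d - h) ** 2 for d, h in zip(distances, fitted))
    den = sum(d ** 2 for d in distances)
    return (num / den) ** 0.5


point_a, point_b = (0.0, 0.0), (3.0, 4.0)
print(minkowski(point_a, point_b, r=1))  # city-block
print(minkowski(point_a, point_b, r=2))  # Euclidean
print(stress1([3.0, 4.0], [3.0, 4.0]))   # perfect fit
```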
MISSING: 
Missing Value Analysis 


The MISSING module displays and analyzes missing value patterns in data. An EM 
algorithm generates maximum likelihood estimates of correlation, covariance, and 
cross-products of deviations matrices. For robust ML estimates where outliers are 
downweighted, the user can specify the degrees of freedom for the t distribution or the 
contamination for a normal distribution. Alternatively, linear regression can be used 
for imputation of missing values. 

Output includes a frequency table of missing value patterns, Little's MCAR test, 
estimated means and correlations, and a pairwise frequency table. Estimated 
correlation, covariance, and SSCP matrices can be saved for further analyses. You can 
also save the raw data with the missing entries replaced with imputed values. 


Setup: 
* MISSING 
* USE 
* MODEL 
SAVE 
PLENGTH 
HOT * ESTIMATE 


MODEL command 


MODEL varlist 


Specifies numerical variables to be analyzed. Categorical variables should contain 
complete data. To use a categorical variable to impute missing values, dummy code the 
variable and include the resulting indicators in varlist rather than the original variable. 

For a description of dummy coding, see “Linear Models III: General Linear 
Models" in Statistics II. 

SAVE command 


SAVE filename 


Saves the next matrix requested into filename.SYD. SYSTAT automatically assigns the 
saved file the corresponding TYPE (CORR, COVARIANCE, or SSCP). 


/ DATA instead of saving a matrix, saves a rectangular file containing the raw data with 
missing values replaced by imputed values. 


PLENGTH command 


PLENGTH LONG 


PLENGTH LONG gives the mean of each variable. In addition, for EM estimation, 
LONG prints an iteration history, missing value patterns, Little's MCAR test and mean 
estimates. 


ESTIMATE command 


ESTIMATE 
Tells SYSTAT to analyze the variables specified in MODEL. 


/ MATRIX=SSCP       matrix to be computed. CORR computes the Pearson correlation 
         COVAR      matrix. COVAR computes the covariance matrix. SSCP computes the 
         CORR       sums of squared deviations from the mean and the sums of cross- 
                    products of mean deviations for the variables. The default is CORR. 

  NORMAL=n1,n2      for the EM algorithm, use maximum likelihood estimates for 
                    contaminated multivariate normal samples where n1 is the probability 
                    of contamination and n2 is the variance of the contamination. 

  T=df              for the EM algorithm, use maximum likelihood estimates for a t 
                    distribution sample where df is the degrees of freedom. The default 
                    is 5. 

  ITER=n            maximum number of iterations for computing the estimates. The 
                    default is 20. 

  CONV=n            convergence criterion. If the relative changes of the covariance 
                    entries are all less than this value, convergence is assumed. The 
                    default is 0.001. 

  REGRESSION        replaces missing values with predicted values from a multiple linear 
                    regression. SYSTAT treats a variable with missing values as the 
                    response, with all other variables specified on the MODEL command 
                    acting as predictors. 
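
The idea behind the REGRESSION option can be sketched with a single predictor. The Python example below is an illustration only, not SYSTAT's algorithm: it fits an ordinary least-squares line on the complete cases and fills each missing response with its predicted value. The function name, the use of None for missing values, and the restriction to one predictor are all assumptions made for brevity. 

```python
def impute_by_regression(x, y):
    """Fill missing y values (None) with predictions from a least-squares
    line fitted on the cases where y is observed."""
    pairs = [(a, b) for a, b in zip(x, y) if b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [b if b is not None else intercept + slope * a
            for a, b in zip(x, y)]


predictor = [1.0, 2.0, 3.0, 4.0]
response = [2.0, 4.0, None, 8.0]
completed = impute_by_regression(predictor, response)
print(completed)
```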
MIX: 
Mixed Regression 


MIX estimates regression models containing combinations of fixed and random effects 
for response data having a normal distribution. These models assume a data structure 
in which observations having a common characteristic form identifiable groups, 
resulting in nesting of the observations. MIX uses random effects to account for 
dependencies in the data due to the nesting structure. 

The model for this mixed-effects regression is: 


$$y = X\beta + Z\upsilon + \varepsilon$$


where $y$ corresponds to the response vector, $X$ equals the design matrix for fixed 
effects, $Z$ is the design matrix for random effects, $\beta$ and $\upsilon$ are vectors of 
regression parameters, and $\varepsilon$ is a vector of residuals. Estimation of the 
parameters involves computing a marginal maximum likelihood solution using both the EM 
algorithm and Fisher scoring. See Hedeker and Gibbons (1996) for details. 

Mixed regression can be applied to both clustered and longitudinal data. In the 
clustered, or cross-sectional, context, observations from different subjects are nested 
within a larger group, such as students within classes. As a result, random effects 
represent differences between the clusters. Similarly, for longitudinal data, 
observations are nested within each subject. In this case, random effects represent 
differences between subjects. 

MIX, ANOVA, and GLM can all be used for repeated measures analysis. However, 
unlike the other two procedures, MIX analyzes unbalanced data, allowing subjects to 
vary on how many measurements were taken or on when the measurements were 
observed. In addition, a variety of autocorrelation structures account for relationships 
in the residuals across measurement occasions. 

For the specified model, SYSTAT reports parameter estimates for all effects, as 
well as variance and covariance estimates for the random effects. You can save design 
matrices, empirical Bayes estimates of the random effects, and predicted values with 
residuals in data files for further analysis and plotting. 

Setup: 
* MIX 
USE 
RESET 
CENTER 
CONVERT 
* MODEL 
RANDOM 
IDENTIFIER 
AUTO 
CATEGORY 
PLENGTH 
SAVE 
HOT * ESTIMATE 


RESET command 


RESET 


Clears all MIX specifications from memory, returning all commands to their default 
state. 


CONVERT command 


CONVERT newvar = varlist 


Wraps two or more variables identified in varlist under the new variable newvar, 
converting multivariate data into the hierarchical structure required by MIX. If there are 
t variables in varlist, the converted data contains t records for the variable newvar for 
each case in the original data set. 

CONVERT creates two new variables, CASE and TRIAL, in the converted data. 
CASE contains the case number from the original data and TRIAL contains the variable 
number (1, 2, ..., t) corresponding to the position of the variable in varlist. If the 
variables CASE and TRIAL already appear in the original data set, the names for the 
created variables end in an underscore and the first integer resulting in a unique name, 
such as SUBJECT_1 or TRIAL_2. Variables not specified in varlist become constants 
across each associated set of t observations in the new data set. 

For example, if the original data are: 


T1      T2      T3      G 
47.8    48.8    49.0    1 
46.4    47.3    47.7    0 


submitting: 


CONVERT SCORE = T1 T2 T3 


yields the following data structure: 


SCORE    SUBJECT    TRIAL    G 
47.8     1          1        1 
48.8     1          2        1 
49.0     1          3        1 
46.4     2          1        0 
47.3     2          2        0 
47.7     2          3        0 


Notice that the conversion eliminates variables specified in varlist from the data set. As 
a result, none of the variables appearing in varlist may be used for other MIX commands; 
use the variable names for the converted data. Because the software processes 
commands sequentially and variables specified on other commands depend on the 
variables in the converted file, CONVERT should appear before commands requiring 
variable designations. 

Usually, newvar corresponds to the dependent variable for the MODEL command 
and CASE corresponds to the id variable specified by IDENTIFIER. For longitudinal 
data, TRIAL represents time and can be used as either a fixed or random effect. 


Note: The behavior of CONVERT corresponds to WRAP with one notable difference: 
CONVERT does not modify the active data file. The software performs the conversion 
during the MIX computations, leaving the active data file unchanged. To permanently 
change the structure of the data, use WRAP instead of CONVERT. 
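
The wide-to-long restructuring that CONVERT performs can be mimicked in a few lines of Python. This is a sketch of the described behavior, not SYSTAT code: the function name convert and the dictionary-based data layout are hypothetical. Like CONVERT, it leaves the original records untouched and returns a new structure. 

```python
def convert(rows, newvar, varlist):
    """Stack the variables in varlist into one variable, adding CASE and TRIAL.
    Variables outside varlist are repeated as constants within each case."""
    long_rows = []
    for case, row in enumerate(rows, start=1):
        constants = {k: v for k, v in row.items() if k not in varlist}
        for trial, name in enumerate(varlist, start=1):
            record = {newvar: row[name], "CASE": case, "TRIAL": trial}
            record.update(constants)
            long_rows.append(record)
    return long_rows


data = [
    {"T1": 47.8, "T2": 48.8, "T3": 49.0, "G": 1},
    {"T1": 46.4, "T2": 47.3, "T3": 47.7, "G": 0},
]
long_data = convert(data, "SCORE", ["T1", "T2", "T3"])
for record in long_data:
    print(record)
```

With t = 3 variables in the list and 2 original cases, the result has 6 records, matching the description above. 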

MODEL command 


MODEL depvar 
MODEL depvar = INTERCEPT + varlist 


Specifies the dependent variable and the fixed effects, if any, for the linear model to be 
estimated. The dependent variable, depvar, must be numeric. Replace varlist with a list 
of numeric or string independent variables separated by plus (+) signs. Specify 
interactions by linking variables with an asterisk (e.g. var1*var2). INTERCEPT is an 
optional parameter corresponding to a constant effect across all groups. 

For models containing no fixed effects, use MODEL to identify the dependent 
variable only. Specify random effects using the RANDOM command. 


RANDOM command 


RANDOM INTERCEPT + varlist 


Lists effects that vary across groups. Replace varlist with a list of numeric or string 
independent variables separated by plus (+) signs. Specify interactions by linking 
variables with an asterisk (e.g. var1*var2). INTERCEPT is an optional parameter 
corresponding to an effect that is constant within each group but varies across groups. 

Any variable can be specified as fixed (using MODEL) or random (using RANDOM). 
However, MIX fits a variable specified as random as both a random effect and a fixed 
effect. As a result, models containing effects that are random but not fixed cannot be 
estimated. 

Use the IDENTIFIER command to specify the variable denoting the groups over 
which the random effects vary. 


IDENTIFIER command 


IDENTIFIER varname 


For models containing random effects, denotes a numeric or string variable that 
identifies group membership; the individual observations are nested within levels of 
this variable. For clustered data, such as students nested within schools, a variable with 
values reflecting the school for each student should be assigned as the identifier 
variable. In longitudinal data, observations are nested within subjects, and the 
identifier corresponds to a variable denoting the subject ID. 

The identifier variable should be a categorical numeric or string variable. For each 
level of varname, MIX estimates a regression parameter for each random effect defined 
by RANDOM. 


AUTO command 


AUTO varname 


For longitudinal data, includes an autocorrelation structure for error in the mixed effect 
model. varname is the name of a numeric variable representing time from which 
autocorrelated errors are to be generated. Typically, values of varname correspond to 
minutes, days, or weeks. 


/ TYPE=AR          type of process used to model the autocorrelation structure. Valid 
       NAR         processes include: stationary first-order autoregressive (AR); non- 
       MA          stationary first-order autoregressive (NAR); stationary first-order 
       ARMA        moving average (MA); stationary first-order autoregressive, moving 
       GEN         average (ARMA); and general Toeplitz structure (GEN). The 
                   default is AR. 

  NUMBER=n         number of autocorrelation terms for the general autocorrelation 
                   process (TYPE = GEN). Enter a number greater than 0, but less 
                   than the maximum number of timepoints. 

  FIX=r1,r2,...    fixes the autocorrelation terms instead of estimating them. For 
                   TYPE = AR, MA, and NAR, enter a single value. For ARMA, enter 
                   two values. For GEN, enter the number of values indicated by 
                   NUMBER. 

CATEGORY command 


CATEGORY varlist 


Specifies numeric or string grouping variables that define cells. Specify for all 
categorical variables for which MIX should generate design variables. 


/ MISS      for each categorical variable, adds a category into which cases with a 
            missing value get classified, allowing cases having missing values on 
            the variable to be included in the analysis. 

  EFFECT    produces parameter estimates that are differences from group means. 
            In the absence of a specified coding technique, the software generates 
            effect codes. 

  DUMMY     produces dummy codes for the design variables instead of effect 
            codes. 


PLENGTH command 


PLENGTH NONE 
        SHORT 
        MEDIUM 
        LONG 


Controls the amount of output reported. To request extended results, enter PLENGTH 
prior to ESTIMATE. Select one of four categories of output: 


NONE      no text output. Use PLENGTH NONE in combination with SAVE to generate 
          data files containing statistics from MIX without displaying the results. 

SHORT     descriptive statistics, starting values, regression parameter estimates, 
          random effect variances and covariances, and correlations for the fixed and 
          random effects. 

MEDIUM    SHORT output plus empirical Bayes estimates of the parameters for the 
          random effects. 

LONG      MEDIUM output plus the iteration history and variances and covariances 
          for the Bayes estimates. 

          Note: Each level of the IDENTIFIER variable yields a separate covariance 
          matrix for the random effects. If your IDENTIFIER variable has many levels, 
          LONG is an understatement of how much output you will receive. 


SAVE command 


SAVE filename 


Saves specified statistics to a file. 


/ DATA     saves the variables for the current model. The software saves categorical 
           variables using effect or dummy codes, assigning root names of FXD or 
           RND, depending on whether they appear as fixed or random effects. After 
           the root, the name continues with an integer corresponding to the position 
           of the variable in the MODEL or RANDOM commands. A subscript 
           denotes the level coded by the variable. 

  BAYES    saves empirical Bayes estimates, as well as posterior variances and 
           covariances for the estimates. 

  RESID    for the average line or surface. RES1 and PRED1 equal the residuals and 
           predicted values for the line or surface for each level of the IDENTIFIER 
           variable. L2RESn equals the residual for each random effect, where n 
           indicates the position of the effect on the RANDOM command. 

Example: 


SAVE MIXOUT / RESID 

ESTIMATE command 


ESTIMATE 

Causes the previously specified model to be estimated. 


/ NREC=r      lists the data for the first r levels of the IDENTIFIER 
              variable. Use with the CONVERT command to view 
              the results of the transformation from multivariate to 
              hierarchical data. The default is 0. 

  NEM=m       number of EM iterations executed before the Fisher scoring 
              algorithm begins. The default is 10. 

  CONV=c      convergence criterion. This is the largest relative 
              change in any parameter estimate before iterations 
              cease. The default is 0.0001. 

  REPAR=ON    during Fisher scoring, reparameterizes the variances 
        OFF   using an exponential transformation to avoid estimation 
              difficulties due to variances near 0. By default, 
              REPAR is ON. 
MIXED: 
Linear Mixed Models 


Linear Mixed Models (LMM) fits and analyzes mixed models with structured 
covariance/correlation matrices for random effects and residuals. Four types of 
structures are provided for random effects: Variance Components, Compound Symmetry, 
Diagonal, and Unstructured. Three are provided for error correlations: Variance 
Components, Compound Symmetry, and Auto-Regressive(1). LMM can fit models such as the 
random intercept model, random coefficients model, variance components model, 
mixed effects ANOVA model, and models with autocorrelated errors. LMM allows random 
effects to be both categorical and continuous. In the Hierarchical Linear Mixed Model 
(HLMM), SYSTAT provides two methods to estimate covariance parameters: Maximum 
Likelihood and Restricted (Residual) Maximum Likelihood. 


Setup: 
* MIXED             * HYPOTHESIS 
* USE                 FMATRIX [matrix] 
* RESET               RMATRIX [matrix] 
* MODEL               DMATRIX [matrix] 
CATEGORY              PAIRWISE 
PLENGTH             HOT * TEST 
RANDOM 
REPEATED 
SAVE 
HOT * ESTIMATE 


Model Building 


RESET command 


Clears all MIXED command specifications from memory, returning all commands to 
their default state. 

MODEL command 


MODEL var1 = INTERCEPT + varlist1 + var2*var3 + var4(var5) 


Specifies the linear model to estimate. var1 is the dependent variable. Specify the 
model with intercept by INTERCEPT or CONSTANT. Specify interactions by linking 
variables with an asterisk. The parentheses denote nested factors. 


Examples: 


MODEL Y = INTERCEPT + A + X*X + B(A) 
MODEL Y = GENDER+GENDER*COUNTRY$ 


MODEL Y = A*B*C 


CATEGORY command 


CATEGORY grpvarlist 
Specifies numeric or string grouping variables that define cells. 


/MISS allows cases with a missing value for the categorical 
variable to be included in the analysis. 


PLENGTH command 


PLENGTH SHORT 
        MEDIUM 
        LONG 


MIXED produces extended output if you set the output length to LONG. For model 
estimation, extended output adds BLUEs of fixed effects and BLUPs of random effects. 

RANDOM command 


RANDOM varlist 

RANDOM var1*var2 

RANDOM var3(var4) 

RANDOM INTERCEPT 

RANDOM INTERCEPT+varlist+var1*var2+var3(var4) 


Specifies the random effects of the model. If you specify multiple random effects in 
one RANDOM statement (separated by '+', ' ', or ','), MIXED treats them as a single 
random effect unless the covariance structure is VCOMPONENTS. You can use multiple 
RANDOM statements if you want to specify more than one random effect. For a 
specified random effect, the following options are available: 


/ STRUCTURE = VCOMPONENTS     specifies a $\sigma^2 I_{p \times p}$ structure for the random 
                              effect covariance matrix, where p is the number of levels 
                              of the random effect. 

              CSYMMETRY       specifies a $\sigma^2 I_p + \sigma_1 J_p$ structure for the random 
                              effect covariance matrix, where J is the matrix of 1's. 

              DIAGONAL        specifies a diag($\sigma_1^2, \sigma_2^2, \ldots, \sigma_p^2$) structure 
                              for the random covariance matrix. 

              UNSTRUCTURED    specifies a general symmetric structure for the random 
                              effect, i.e., the number of parameters to be estimated 
                              is p(p+1)/2. 

  SUBJECT = term              specifies the grouping factor for the random effect. term 
                              can be nested or crossed factors, or simply a variable. 

  MEANS                       for the VCOMPONENTS structure, forces MIXED to 
                              estimate a common variance parameter for all random 
                              effects. 

Examples: 


RANDOM INTERCEPT * A * B 
RANDOM INTERCEPT * A /SUBJECT = B 


RANDOM A B /SUBJECT = ID STRUCTURE = CS 

REPEATED command 


REPEATED 

Specifies the covariance structure for errors. 


/ STRUCTURE = VCOMPONENTS    specifies a $\sigma^2 I$ structure for within subject errors. 

              CSYMMETRY      specifies compound symmetric structure for within 
                             subject errors. 

              AR(1)          specifies auto-regressive covariance structure of order 
                             1 for within subject errors. 

  GROUP = term               specifies the grouping factor for the errors. term can be 
                             nested or crossed factors, or simply a variable. 
                             Each combination of these factors defines one subject. 
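
To show what these named structures look like as matrices, the following Python sketch builds small compound symmetric and AR(1) covariance matrices. It is an illustration, not SYSTAT code: the function names are hypothetical, and the parameterizations (a common covariance added to a diagonal variance, and a variance times rho raised to the lag) are the usual textbook forms, assumed rather than confirmed to match SYSTAT's. 

```python
def cs_cov(n, sigma2, sigma1):
    """Compound symmetry: sigma2 * I + sigma1 * J (J is the matrix of 1's)."""
    return [[sigma1 + (sigma2 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def ar1_cov(n, sigma2, rho):
    """AR(1): covariance decays geometrically with the lag |i - j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]


cs = cs_cov(3, sigma2=2.0, sigma1=0.5)
ar = ar1_cov(3, sigma2=2.0, rho=0.5)
for row in cs:
    print(row)
for row in ar:
    print(row)
```

In the compound symmetric matrix every pair of occasions shares the same covariance, whereas in the AR(1) matrix occasions further apart are less strongly related. 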


SAVE command 


SAVE filename 


Saves specified statistics to a file. Enter SAVE before ESTIMATE. 


/ MRESIDUALS       marginal residuals 
  CRESIDUALS       conditional residuals 
  DATA             data from original file along with marginal and conditional residuals 
  COVPARAMETERS    estimates of covariance parameters 
  FIXED            Best Linear Unbiased Estimates (BLUE) of fixed effects 
  RANDOM           Best Linear Unbiased Predictors (BLUP) of random effects 
  SERRORFIXED      standard errors of fixed effects estimates 
  MODEL            marginal and conditional residuals, response variable, and the design 
                   matrices 

Examples: 

SAVE MYFILE / COVPAR 


SAVE MYFILE / MRESIDUALS 

ESTIMATE command 


ESTIMATE 


Tells SYSTAT to fit the specified model. 


/ METHOD = ML                    specifies the method of estimation. Available methods are: 
           REML                  1. Maximum Likelihood (ML) and 
                                 2. Restricted Maximum Likelihood (REML) 

  TYPE = HESSIAN                 specifies the type of convergence to be checked in order 
         LIKELIHOOD              to stop iterations. 
         PARAMETERS 

  CRITERION = RELATIVE           specifies the convergence criterion to be used. 
              ABSOLUTE 

  NEM = n1                       specifies the maximum number of EM iterations to be 
                                 performed before continuing to Newton-Raphson 
                                 iterations. 

  NNR = n2                       specifies the maximum number of Newton-Raphson 
                                 iterations. 

  CONVERGENCE = d1               sets the cutoff value for the convergence criterion. 
                                 Iterations stop when the convergence criterion is less 
                                 than this cutoff value. The default is 1e-8. 

  HALF = n3                      specifies the maximum number of step-halvings in 
                                 Newton-Raphson iteration. 

  TOLERANCE = d2                 sets the tolerance for double precision computations. 
                                 The default is 1e-12. 

  CONFIDENCE = d4                sets the value for the confidence interval for the 
                                 estimated parameters. The default is 0.95. 

  GSTART = [g1, g2, ..., gk]     specifies the vector of parameters to be used as initial 
                                 estimates of random effects covariance parameters. 
                                 Specify error variance as the last parameter in GSTART. 

  RSTART = [r1, r2, ..., rl]     specifies the vector of parameters to be used as initial 
                                 estimates of error covariance parameters. 
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Hypothesis testing 


HYPOTHESIS command 


HYPOTHESIS 


Tells SYSTAT that you want to test hypotheses on a previous MODEL. You must enter 
HYPOTHESIS before you can use any of the following commands. 


PAIRWISE command 


PAIRWISE varlist 


Performs pairwise equality checks among the coefficients of the specified fixed 
effect. 


/ BONF      Bonferroni comparisons. This is the default method.
  LSD       LSD pairwise comparisons.
  TUKEY     uses the Studentized range statistic to make all pairwise comparisons.
  SCHEFFE   Scheffe pairwise comparisons.
  SIDAK     Student's t statistic for pairwise multiple comparisons.
  GT2       uses the Studentized maximum modulus distribution.


Examples: 


PAIRWISE SEASON / BONF 
PAIRWISE SEASON 


FMATRIX command 


FMATRIX [matrix] 


FMATRIX is a matrix of linear weights contrasting the coefficient estimates for fixed 
effects. Specify as many numbers as the dimension of your beta vector. 


Example: 


FMATRIX [1 -1 0 0] 


RMATRIX command 


RMATRIX [matrix] 


RMATRIX is a matrix of linear weights contrasting the coefficient estimates for random 
effects. Specify as many numbers as the dimension of your gamma vector. 


Example: 


RMATRIX [2 1 -1] 


DMATRIX command 


DMATRIX [matrix] 


DMATRIX specifies D, the null hypothesis vector. By default, D is a null (zero) vector. 


Example: 


DMATRIX [10 15] 


TEST command 


TEST 


Initiates the test of your hypothesis. 


/ CONFI=n    specifies the level of confidence.
  ESTIMATE   provides an estimate of the estimable linear parametric function, its
             standard error, and the corresponding t test.


NONLIN: 
Nonlinear Models 


Setup: 


NONLIN estimates parameters for a variety of nonlinear models using a Gauss-Newton 
(SYSTAT computes exact derivatives), Quasi-Newton, or Simplex algorithm. In 
addition, you can specify a loss function other than least squares, so maximum 
likelihood estimates can be computed. You can set lower and upper limits on individual 
parameters. When the parameters are highly intercorrelated, and there is concern about 
overfitting, you can fix the value of one or more parameters, and NONLIN will test the 
result against the full model. If the estimates have trouble converging, or they converge 
to a local minimum, Marquardting is available. 

For assessing the certainty of the parameter estimates, NONLIN offers Wald 
confidence regions and Cook-Weisberg graphical confidence curves. The latter are 
useful when it is unreasonable to assume that the estimates follow a normal 
distribution. You can also save values of the loss function for plotting contours in a 
bivariate display of the parameter space. This allows you to study the combinations of 
parameter estimates with approximately the same loss function values. 

When your response contains outliers, you may want to downweight their residuals 
using one of NONLIN's robust ψ functions: median, Huber, Hampel, bisquare, t, trim, 
or the pth power of the absolute value of the residuals. 


* NONLIN 
* USE 
* MODEL 
LOSS or ROBUST 
RESET 
PLENGTH 
FUNPAR 
SAVE 
HOT * ESTIMATE 
FIX 
HOT ESTIMATE 


If convergence is not achieved, type ESTIMATE and iterations will continue. 


MODEL command 


MODEL var = function 


Specifies a general algebraic equation model to be estimated. Terms that are not 
variables are assumed to be parameters. 


Examples: 
MODEL RESPONSE = CONSTANT + BETA*X 


MODEL Y = A*(1-EXP(B*X))*SIN(C*X) 


LOSS command 


LOSS function 


Specifies a loss function to apply in model estimation. The default is ordinary least 
squares. Maximum likelihood estimation, for example, can be accomplished by 
specifying negative log-likelihood with LOSS. You can use SYSTAT's variable 
ESTIMATE (the fitted values from your MODEL) in the loss function if it is not a 
variable in your SYSTAT file. For maximum likelihood estimation, see SCALE under 
ESTIMATE. 


Examples: 
LOSS (Y-ESTIMATE)^2 
LOSS ABS(Y-ESTIMATE) 


Note: 

You can put constraints on models or loss functions. For example: 

MODEL Y = (ABS(R)<3.14159)*SIN(R/BETA) 

LOSS (THETA>0 AND THETA<1)*(Y-ESTIMATE) + ... 

You can also use LOSS to select a subset of the cases. For example: 


LOSS (CASE<100)*(Y-ESTIMATE)^2 


RESET command 


RESET dependvar = expression 
weightvar = expression 


The dependent variable or the weight variable can be recomputed after each iteration, 
using the current values of the parameters. For one type of maximum likelihood 
solution, the dependent variable can be reset to (ESTIMATE + 1). For iteratively 
reweighted least-squares or maximum likelihood, reset the weight variable. 


Example: 


RESET W = (COUNT / ESTIMATE * (COUNT-ESTIMATE)) 


ROBUST command 


ROBUST argument 


Selects a robust ψ function for downweighting the influence of extreme residuals. Use 
argument to define a specific loss function. The parameters for HUBER, HAMPEL, t, 
BISQUARE, RAMSAY, ANDREWS, and TUKEY are defined in MAD (median absolute deviation 
from the median) units of the residuals. 


ABSOLUTE          sum of absolute values of residuals (1st power).
POWER=n           sum of the nth power of absolute values of residuals. The default
                  is 1.5.
TRIM=n            n proportion of the residuals (those with the largest absolute
                  values) are trimmed. Then, the sum of squares of the remaining
                  residuals is minimized. The default is 0.10, the proportion
                  trimmed.
HUBER=n           sum of MAD standardized residuals weighted by HUBER(n). The
                  default is 1.7.
HAMPEL=n1,n2,n3   sum of MAD standardized residuals weighted by HAMPEL(n1, n2, n3).
                  The default is 1.7, 3.4, 8.5.
t=df              sum of df/(df+u^2), where u = residuals / MAD(residuals) and df is
                  the degrees of freedom for the t distribution. The default is 5.0,
                  the df of t.
BISQUARE=n        sum of MAD standardized residuals weighted by BISQUARE(n). The
                  default is 7.0.
RAMSAY=n          sum of MAD standardized residuals weighted by RAMSAY(n). The
                  default is 0.3.
ANDREWS=n         sum of MAD standardized residuals weighted by ANDREWS(n). The
                  default is 1.339.
TUKEY=n           sum of MAD standardized residuals weighted by TUKEY(n). The
                  default is 5.5.


Examples: 


ROBUST ABSOLUTE 
ROBUST HAMPEL=1.5, 3.0, 6 
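
The MAD standardization that these options rely on can be sketched in Python. This is an illustration only, not SYSTAT's implementation: the helper names mad_standardize and huber_weight are hypothetical, and the weight shown is the textbook Huber form (1 inside the cutoff, c/|u| outside), which is an assumption about how HUBER(n) treats residuals.

```python
from statistics import median


def mad_standardize(residuals):
    """Divide each residual by MAD, the median absolute deviation from the median."""
    center = median(residuals)
    mad = median(abs(r - center) for r in residuals)
    if mad == 0:
        raise ValueError("MAD is zero; residuals cannot be standardized")
    return [r / mad for r in residuals]


def huber_weight(u, c=1.7):
    """Textbook Huber weight (assumed form): 1 within [-c, c], c/|u| beyond it."""
    return 1.0 if abs(u) <= c else c / abs(u)


residuals = [0.2, -0.4, 0.1, 0.3, -0.2, 6.0]   # the last value is an outlier
u = mad_standardize(residuals)
weights = [huber_weight(x) for x in u]
```

Residuals within 1.7 MAD units keep full weight, while the outlier is strongly downweighted.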


PLENGTH command 


PLENGTH SHORT 
LONG 


SHORT prints default output. LONG adds Wald-type confidence intervals and 
asymptotic standard errors and correlations for parameter estimates. 


FUNPAR command 


FUNPAR name1=function1 
FUNPAR name2=function2 


Estimates functions of parameters. Assigns a name to each function. You can state as 
many functions as you want. NONLIN evaluates each function and reports related 
statistics. 


Examples: 


FUNPAR LD50=ALPHA/BETA 


FUNPAR SQUARES=P1^2+P2^2 


SAVE command 


SAVE filename 


Saves residuals, estimated values, and the variables from the model statement to a file. 
You can place one SAVE command before ESTIMATE and one or more SAVE commands after 
ESTIMATE. 


/ DATA           data, estimated values, and residuals.
  RESID          estimated values, residuals, and variables from the model statement.
  PARAMS         parameter values.
  RS=p1,p2       saves five levels of contours of the loss function surrounding the
                 converged minimum (like a response surface for the loss function in
                 a 2-D parameter space). Specify names of two parameters (p1, p2). If
                 the names are omitted, SYSTAT assumes the first two parameters in
                 your model.
  CI=p1,p2,...   saves the data values NONLIN computes for Cook-Weisberg confidence
                 curves, plus commands to plot the values. Specify names of
                 parameters (p1, p2, ...). You can specify a subset of the
                 parameters.
  CR=p1,p2       saves a closed curve that defines the 95% confidence region for a
                 pair of parameters surrounding the converged minimum. Specify their
                 names as p1, p2. Use CONFI to change the size of the region. Differs
                 from RS in that SYSTAT reestimates the whole model when computing
                 each value of the loss function (instead of fixing unspecified
                 parameters at their estimated value). This option is useful when
                 there are more than two parameters.
  CONFI=n        confidence level for CR, 0 < n < 1.0.


ESTIMATE command 


ESTIMATE 


Tells SYSTAT to estimate your MODEL. SYSTAT provides three algorithms for solving 
the problem: Gauss-Newton (GN), Quasi-Newton (QUASI), and Simplex (SIMPLEX). 


/ GN               modified Gauss-Newton method that computes exact derivatives.
  QUASI or QN      Quasi-Newton method of model estimation that uses numeric
                   estimates of the first and second derivatives.
  SIMPLEX          Simplex method of model estimation that uses a direct search
                   procedure.
  MARQUARDT=n      Marquardt method of inflating the diagonal of the
                   (Jacobian' Jacobian) matrix by n. This speeds convergence when
                   initial values are far from the estimates and when the estimates
                   of the parameters are highly intercorrelated. This method is
                   similar to "ridging," except that the inflation factor n is
                   omitted from final iterations.
  START=n1,n2,...  starting values for MODEL parameters. Specify values for each
                   parameter in the order the parameters appear in your MODEL
                   statement (or LOSS statement if no MODEL is specified).
                   The default is various 10th powers of 101, 102, 103,... with or
                   without the minus sign, depending on the model considered.
  RESTART          if START values are defined, resets the initial estimates to the
                   specified starting values for each BY group or bootstrap sample.
  MIN=n1,n2,...    lower limits for the parameters, one number per parameter.
  MAX=n1,n2,...    upper limits for the parameters, one number per parameter.
  ITER=n           maximum number of iterations for fitting your model. The default
                   is 20.
  HALF=n           maximum number of step halvings. (If RSS increases between two
                   iterations, the increment size is halved.) The default is 8.
  TOL=n            a check for near singularity when NONLIN inverts the matrix of
                   sums of cross-products of the derivatives. The default is 0.0001.
  LCONV=n          loss convergence criterion. If the relative improvement of LOSS is
                   less than this value, then convergence is assumed. Note that, for
                   convergence, both LCONV and CONV must be satisfied. (If you
                   specify only one, the default criterion for the other must also be
                   satisfied.) To turn off LCONV and CONV, use 1.0 or larger. The
                   default is 0.000001.
  CONV=n           parameter convergence criterion. If the largest relative
                   improvement of parameters is less than this value, then
                   convergence is assumed for the estimate. Each parameter estimate
                   must satisfy this criterion. See also LCONV. The default is
                   0.00001.
  SCALE            rescales the mean square error to 1 at the end of the iterations.
                   Use for maximum likelihood estimation.


Examples: 
ESTIMATE / ITER=30 TOL=.001 


ESTIMATE / SIMPLEX START=-1, .5, 10 
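
The rule that both LCONV and CONV must be satisfied can be sketched as a stopping check in Python. This is a minimal sketch under stated assumptions: the function name converged is hypothetical, and "relative improvement" is taken here to mean the absolute change divided by the previous value, which the text does not spell out.

```python
def converged(old_loss, new_loss, old_params, new_params,
              lconv=0.000001, conv=0.00001):
    """Return True only when both the loss and every parameter have settled.

    Assumption: relative improvement = |new - old| / |old|, with a small
    floor on the denominator to avoid division by zero.
    """
    tiny = 1e-300
    rel_loss = abs(new_loss - old_loss) / max(abs(old_loss), tiny)
    rel_param = max(
        abs(new - old) / max(abs(old), tiny)
        for old, new in zip(old_params, new_params)
    )
    return rel_loss < lconv and rel_param < conv


# Loss has settled but one parameter is still moving: not converged.
still_moving = converged(10.0, 10.0, [1.0, 2.0], [1.0, 2.2])
# Both the loss and the parameters have settled: converged.
settled = converged(10.0, 10.0 - 1e-9, [1.0, 2.0], [1.0, 2.0 + 1e-9])
```

Passing 1.0 or larger for lconv or conv makes the corresponding test trivially easy to satisfy, which mirrors how the text describes turning a criterion off.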


FIX command 


FIX p1=n1, p2=n2, ... 


Specifies names of parameters to be held fixed at a constant value (n1, n2, ...). SYSTAT 
estimates the remaining parameters and tests whether the result differs from that for the 
full model. Be sure to specify the full model first; that is, specify ESTIMATE for the full 
model, then FIX... followed by a second ESTIMATE. 
NPAR: 
Nonparametric Tests 


NPAR computes nonparametric statistics for: 


■ testing differences for a single variable across two or more independent groups of 
  cases: the Mann-Whitney rank sum test, the Kruskal-Wallis one-way analysis of 
  variance, and the Kolmogorov-Smirnov two-sample test. 

■ testing differences among two or more dependent variables: the Sign test, the 
  Wilcoxon signed rank test, the Friedman two-way ANOVA (or repeated measures) 
  test, and the Quade two-way ANOVA test. 

■ studying the distribution of a single variable: the Kolmogorov-Smirnov one- 
  sample test, Anderson-Darling test, and the Wald-Wolfowitz runs test. 


Setup: 
* NPAR 
USE 
SAVE 
HOT * KRUSKAL or AD or KS or SIGN or WILCOXON or FRIEDMAN or QUADE or RUNS 


Two-sample 


KRUSKAL command 


KRUSKAL varlist *grpvar 


Computes a Kruskal-Wallis analysis of variance for each variable in varlist. The values 
of a variable are transformed to ranks (ignoring group membership). This test uses 
these ranks to test that there is no shift in the center of the groups (that is, the centers 
do not differ). This is the nonparametric analog of a one-way analysis of variance. If 
grpvar has two levels, the Mann-Whitney statistic is reported. 
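
The rank transformation described above can be sketched in Python. This is an illustration under stated assumptions, not SYSTAT's code: the function names are hypothetical, tied values receive average ranks, and the H statistic is the uncorrected textbook form (no tie correction), which may differ from what SYSTAT reports.

```python
def midranks(values):
    """Rank all values together, ignoring groups; ties get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1          # positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def kruskal_wallis_h(values, groups):
    """Uncorrected Kruskal-Wallis H computed from pooled ranks."""
    n = len(values)
    rank_sum, count = {}, {}
    for rank, group in zip(midranks(values), groups):
        rank_sum[group] = rank_sum.get(group, 0.0) + rank
        count[group] = count.get(group, 0) + 1
    total = sum(rank_sum[g] ** 2 / count[g] for g in rank_sum)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


h = kruskal_wallis_h([1, 2, 3, 4, 5, 6], ["A", "A", "B", "B", "C", "C"])
```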


KS command (Two-sample) 


KS varlist *grpvar 


Computes a Kolmogorov-Smirnov two-sample test to measure the discrepancy 
between two sample cumulative distribution functions. The distributions can be 
organized as two variables (two columns) or as a single variable (column) with a 
second variable that identifies group membership. For a test with unequal sample sizes, 
specify one numeric variable followed by a grouping variable, grpvar. Alternatively, if 
sample sizes are equal (after deletion of missing values), specify varlist only and tests 
are performed for every pair of variables (columns) in the list. If you omit varlist, two- 
sample tests are computed using all numeric variables. 
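
The discrepancy between two sample cumulative distribution functions can be sketched in Python as the largest vertical gap between them. This is a minimal sketch: the function name ks_two_sample_d is hypothetical, and it returns only the D statistic, not the p-value that a full test would report.

```python
from bisect import bisect_right


def ks_two_sample_d(x, y):
    """Largest absolute difference between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:                       # the gap can only change at data points
        fx = bisect_right(xs, v) / len(xs)  # share of x values <= v
        fy = bisect_right(ys, v) / len(ys)  # share of y values <= v
        d = max(d, abs(fx - fy))
    return d


d_overlap = ks_two_sample_d([1, 2, 3, 4], [3, 4, 5, 6])
d_disjoint = ks_two_sample_d([1, 2, 3], [4, 5, 6])
```

Because the statistic is built from proportions, the two samples do not need to be the same size.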


Related variables 


SIGN command 


SIGN varlist 


Computes a sign test on all pairs of variables in varlist to test for a difference between 
the two variables in each pair. If you omit varlist, all numeric variables are used. 


WILCOXON command 


WILCOXON varlist 


Computes a Wilcoxon signed rank test of the differences between two variables for all 
pairs of variables in varlist. This test is the nonparametric analog of the paired t test. 
If you omit varlist, all numeric variables are used. 


FRIEDMAN command 


FRIEDMAN varlist = groupvar blockvar 


Performs the Friedman test for each variable in varlist. If you omit varlist, all numeric 
variables are used. groupvar is the categorical variable representing the treatment 
factor. blockvar is the categorical variable representing the block factor. If groupvar and 
blockvar are omitted, then each variable in varlist is considered as a treatment and each 
row (case) is treated as a block. 


QUADE command 


QUADE varlist = groupvar blockvar 


Performs a Quade test for each variable in varlist. If you omit varlist, all numeric 
variables are used. groupvar is the categorical variable representing the treatment 
factor. blockvar is the categorical variable representing the block factor. If groupvar and 
blockvar are omitted, then each variable in varlist is considered as a treatment and each 
row (case) is treated as a block. 


/ MULTIPLE    performs a multiple comparisons test among the variables in varlist.


One-sample 


AD command 


AD varlist 


Computes the Anderson-Darling test. Specify a distribution to which the variables are 
compared. 


KS command (One-sample) 


KS varlist 


Computes a Kolmogorov-Smirnov one-sample test. Specify a distribution to which the 
variables are compared. 


/   BENFORD=B               Benford's Law with parameter B
  * BETA=shp1,shp2          beta distribution with two shape parameters shp1 and shp2
    BINOMIAL=n,p            binomial distribution with sample size and probability of
                            success
  * CAUCHY=loc,sc           Cauchy distribution with location and scale parameters
  * CHISQ=df                chi-square with df degrees of freedom
    DUNIFORM=N              discrete uniform distribution with parameter N
  * DEXP=loc,sc             double exponential (Laplace) distribution with location
                            and scale parameters
  * ERLANG=shp,sc           Erlang distribution with shape and scale parameters
  * EXP=loc,sc              exponential distribution with location and scale
                            parameters
  * F=df1,df2               F distribution with df1 and df2 degrees of freedom
  * GAMMA=shp,sc            gamma distribution with shape and scale parameters
    GEOMETRIC=p             geometric distribution with success probability
  * GOMPERTZ=b,c            Gompertz distribution with parameters b and c
  * GUMBEL=loc,sc           Gumbel distribution with location and scale parameters
    HGEOMETRIC=N,m,n        hypergeometric distribution with parameters N, m, and n
  * IGAUSSIAN=loc,sc        inverse Gaussian distribution with location and scale
                            parameters
    LILLIEFORS              Lilliefors test for normality
    LSERIES=theta           logarithmic series with parameter theta
  * LOGISTIC=loc,sc         logistic distribution with location and scale parameters
  * ENORMAL=loc,sc          logit normal distribution with location and scale
                            parameters
  * LLOGISTIC=shp,log(sc)   log logistic distribution with shape and logarithm of
                            scale parameters
  * LNORMAL=loc,sc          log normal distribution with location and scale
                            parameters
    NBINOMIAL=k,p           negative binomial distribution with parameters k and p
  * NCCHISQ=df,delta        non-central chi-square distribution with parameters df
                            and delta
  * NCF=df1,df2,delta       non-central F distribution with df1, df2 degrees of
                            freedom and delta
  * NCT=df,delta            non-central t distribution with df degrees of freedom and
                            delta
  * NORMAL=loc,sc           normal distribution with the location parameter loc and
                            scale parameter sc
  * PARETO=thr,shp          Pareto distribution with threshold parameter thr and
                            shape parameter shp
    POISSON=lambda          Poisson distribution with mean lambda
  * RAYLEIGH=sc             Rayleigh distribution with scale parameter sc
  * SEV=loc,sc              smallest extreme value distribution with location and
                            scale parameters
  * SMM=k,df                Studentized maximum modulus distribution
  * RANGE=k,df              Studentized range with parameters k and df
  * T=df                    t distribution with df degrees of freedom
  * TRIANGULAR=a,b,c        triangular distribution with parameters a (min), b (max),
                            and c (mode)
  * UNIFORM=min,max         uniform with minimum and maximum values min, max
  * WEIBULL=sc,shp          Weibull distribution with scale and shape parameters
    ZIPF=shp                Zipf distribution with shape parameter shp


* indicates distributions available under Anderson-Darling test 


Note: The Lilliefors test uses the standard normal distribution. The variables that you select are 
automatically standardized, and the test determines whether the standardized versions are normally 
distributed. Lilliefors is not a distribution but is included under 'distributions' for convenience. It can 
be used to test normality when the parameters are not specified. 


RUNS command 


RUNS varlist 


Computes a Wald-Wolfowitz runs test on all numeric variables in varlist to detect serial 
patterns in a run of numbers. If you omit varlist, all numeric variables are used. 


/ CUT=n    a baseline for determining runs. The default is 0.
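
How a cut value turns a series into runs can be sketched in Python. This is a sketch under stated assumptions rather than SYSTAT's procedure: the function name runs_test is hypothetical, values equal to the cut are counted as "not above" (the text does not say how they are treated), and the z statistic uses the standard large-sample mean and variance of the number of runs.

```python
from math import sqrt


def runs_test(values, cut=0.0):
    """Count runs above/below the cut and compare with the expected count."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    above = [v > cut for v in values]        # assumption: ties count as "not above"
    n1 = sum(above)
    n2 = len(above) - n1
    n = n1 + n2
    runs = 1 + sum(1 for a, b in zip(above, above[1:]) if a != b)
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1))
    z = (runs - expected) / sqrt(variance) if variance > 0 else float("nan")
    return runs, expected, z


runs, expected, z = runs_test([1, 2, -1, -2, 3, 4, -5], cut=0)
```

Too few runs (strongly negative z) suggests clustering, and too many (strongly positive z) suggests alternation.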


SAVE command 


SAVE filename 


Saves the test statistic and p-values to a file. 


PERMAP: 
Perceptual Mapping 


PERMAP computes coordinates for plotting subjects in perceptual mapping studies. 
PERMAP offers two types of tools. The first is a group of procedures for fitting subjects 
and objects in a common space. This group includes internal and external unfolding 
models, MDPREF and PREFMAP, as well as BIPLOT, which is a minor modification of 
MDPREF. The second is a set of procedures for relating one dimensional configuration 
to another, generally called PROCRUSTES rotation. Both the orthogonal procrustes 
and the more general canonical rotations are available. 


Setup: 
* PERMAP 
* USE 
* MODEL 
HOT * ESTIMATE 
MODEL command 


MODEL varlist 
or 
MODEL depvarlist = indvarlist 


For MDPREF and BIPLOT, use a single varlist. For PREFMAP and PROCRUSTES 
rotations, use two varlists. 


ESTIMATE command 


ESTIMATE 


Runs the problem. 


/ METHOD = BIPLOT         methods of estimation. The default is BIPLOT.
           MDPREF         a BIPLOT with all vectors of the same unit length.
           VECTOR         VECTOR, CIRCLE, and ELLIPSE methods are Carroll and
           CIRCLE         Chang's PREFMAP model.
           ELLIPSE
           PROCRUSTES     Procrustes and canonical methods rotate two matrices (in
           CANONICAL      the two-set MODEL) to maximum congruence. The canonical
                          method allows translation, rotation, and dilation or
                          compression of axes.
  STANDARDIZE             standardizes the data before fitting.
  DIMENSION=n             specifies the number of dimensions to do the scaling.
                          The default is 2.
  POLARITY = POSITIVE     specifies the polarity of the preferences when doing
             NEGATIVE     preference mapping. If the smaller number indicates the
                          least and the higher number the most, use
                          POLARITY=POSITIVE. For example, a questionnaire may include
                          the question "Rate a list of movies where one star (*) is
                          the worst and five stars (*****) is the best." If the
                          higher number indicates a lower ranking and the lower
                          number indicates a higher ranking, use POLARITY=NEGATIVE.
                          For example, a questionnaire may include the question
                          "Rank your favorite sports team where 1 is the best and 10
                          is the worst." The default is POSITIVE.


PLS: 


Partial Least Squares Regression 


PLS provides the Partial Least Squares regression technique. The regression can be 
either univariate multiple regression or multivariate multiple regression. You can fit 
the model by either of two algorithms: SIMPLS (Straight-forward Implementation of 
Partial Least Squares) or NIPALS (Nonlinear Iterative Partial Least Squares). The 
fitted model can be checked by cross-validation. PLS offers two cross-validation 
techniques: LOUT (leave-one-out) and RAN (random exclusion). 


Setup: 
* PLS 
* USE 
* MODEL 
PLENGTH 
SAVE 
HOT * ESTIMATE 
MODEL command 


MODEL yvarlist = xvarlist 


Specifies the model. More specifically, this command specifies the response variables 
and the predictor variables. There is no need to include any 'Constant' as SYSTAT 
always calculates the intercepts. 


/ N=n    specifies the number of latent factors to be extracted.


Example: 


MODEL Y1 Y2=X1 X2..X20 


PLENGTH command 


PLENGTH SHORT 
MEDIUM 
LONG 


Controls the amount of output reported. Give PLENGTH prior to ESTIMATE. Select any 
one of the following options: 


SHORT         estimated values of the coefficients, standard errors of the estimated
              coefficients, ANOVA table, and percentage of variance explained by the
              factors. If any cross-validation technique is selected, then the value
              of the PRESS statistic (average PRESS statistic in case of random
              exclusion) and the R^2 prediction value are also reported.
MEDIUM/LONG   X-loadings and Y-loadings along with the SHORT output.


SAVE command 


SAVE filename 


Saves the specified statistics/results to a file. 


/ COEFF    saves the estimated regression coefficient matrix.
  RESID    saves the residuals and the fitted values after fitting the regression
           equation.
  DATA     saves the original data.
  SCORE    saves the X-scores and Y-scores.


ESTIMATE command 


ESTIMATE 


Tells SYSTAT to estimate the analysis specified in MODEL. 


/ SIMPLS         fits the model by the SIMPLS algorithm.
  NIPALS         fits the model by the NIPALS algorithm.
  CV=LOUT        performs cross-validation by the leave-one-out technique.
  CV=RAN(r,s)    performs cross-validation by the random exclusion technique. At
                 each step, s observations are selected without replacement and
                 excluded. The process is repeated r times.


POSAC: 
Partially Ordered Scalogram Analysis with Coordinates 


POSAC calculates a partially ordered scalogram with coordinates. 


Setup: 


* POSAC 
* USE 
* MODEL 
SAVE 
HOT * ESTIMATE 


MODEL command 


MODEL varlist 


Specifies the items to be scaled. 


ESTIMATE command 


ESTIMATE 


Runs the problem. 


/ ITERATIONS=n   The default is 50.
  CONVERGE=d     sets the convergence criterion. This is the largest relative change
                 in any coordinate before iterations terminate. The default is
                 0.00001.


SAVE command 


SAVE filename 


Saves specified statistics to a file. 


POWER: 
Power Analysis 


POWER determines the sample size requirements and plots the power curve relating 
power and sample size for one-way and two-way analysis of variance, z tests, t tests, 
tests for proportions, and correlational tests. In addition, sample size can be determined 
for any experiment in which the power at the alternative hypothesis is given by a non- 
central F-distribution, provided certain restrictions apply. You can also calculate the 
power corresponding to specified sample sizes for any of these tests. Results can be 
saved to data files for subsequent analyses, including overlay plots of power curves and 
power surfaces. Furthermore, using an iterative process, you can explore the effects of 
changing hypothesized parameters (effect size) and alpha level. 


Setup: 
* POWER 
* MODEL 
SAVE 
HOT * ESTIMATE 
MODEL command 


MODEL type 


Defines the hypothesis test used in the power analysis. Select a type from the following 
list: 


Type      Design
PROP1     single proportion
PROP2     equality of two proportions
CORR1     single correlation coefficient
CORR2     equality of two correlations
Z1        one-sample z test
Z2        two-sample z test
T1        one-sample t-test
PAIRED    paired t-test
T2        two-sample t-test
ONEWAY    one-way analysis of variance
TWOWAY    two-way analysis of variance
GENERIC   general approach to tests employing the non-central F-distribution


Optional specifications depend on the test being used. 


MODEL PROP1 


Use this test to determine whether a single proportion differs from a specified value. 
Designate the parameters of the test using the following options: 


/ P1 = p       hypothesized proportion according to the alternative hypothesis.
  NULL = q     hypothesized proportion according to the null hypothesis.
  ALTER = NE   alternative hypothesis p ≠ q, p < q, or p > q.
          LT
          GT

MODEL PROP2 


This model tests for the equality of proportions for two independent groups. Designate 
the parameters of the test using the following options: 


/ P1 = p       hypothesized proportion for the first group.
  P2 = q       hypothesized proportion for the second group.
  RATIO = d    ratio of the sample size for the first group to the sample size for
               the second group. The default is 1 (equally sized groups).
  ALTER = NE   alternative hypothesis p ≠ q, p < q, or p > q.
          LT
          GT


MODEL CORR1 


Use this correlation test to determine whether a correlation coefficient differs from a 
specified value. Specify the hypotheses using the following options: 


/ COEF1 = ρ    population correlation coefficient according to the alternative
               hypothesis.
  NULL = ρ0    population correlation coefficient according to the null hypothesis.
               The default is 0.
  ALTER = NE   alternative hypothesis ρ ≠ ρ0, ρ < ρ0, or ρ > ρ0.
          LT
          GT


MODEL CORR2 


This test addresses the equality of two correlation coefficients. The following options 
are available: 


/ COEF1 = ρ1   population correlation coefficients according to the alternative
  COEF2 = ρ2   hypothesis.
  N1 = m1      sample size for each correlation. In the absence of N1 and N2, the
  N2 = m2      software assumes samples of the same size.
  ALTER = NE   alternative hypothesis ρ1 ≠ ρ2, ρ1 < ρ2, or ρ1 > ρ2.
          LT
          GT


MODEL Z1 


This model tests whether or not a mean equals a known constant, assuming the 
population standard deviation is known. Designate the parameters of the test using the 
following options: 


/ M1 = μ1      mean according to the alternative hypothesis.
  NULL = μ0    mean according to the null hypothesis.
  SIGMA = σ    population standard deviation.
  ALTER = NE   alternative hypothesis μ1 ≠ μ0, μ1 < μ0, or μ1 > μ0.
          LT
          GT


MODEL Z2 


The two-sample z test addresses the equality of two means, assuming the two 
population standard deviations are known and equal to each other. Designate the 
parameters of the test using the following options: 


/ POOLED = σ   pooled standard deviation across the two groups.


Specify the alternative hypothesis using one of the following methods: 


  M1 = μ1      hypothesized mean for each group.
  M2 = μ2
  RANGE = d    hypothesized mean difference between the two groups.
  STEFF = c    standardized mean difference, equal to the absolute mean difference
               divided by the pooled standard deviation. Using STEFF in place of the
               means or mean difference eliminates the need to specify the pooled
               standard deviation.
  ALTER = NE   alternative hypothesis μ1 ≠ μ2, μ1 < μ2, or μ1 > μ2.
          LT
          GT
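
The relationship between these options and power can be sketched in Python using the textbook normal-theory formula for a two-sample z test. This is an illustration under stated assumptions, not SYSTAT's algorithm: the function name power_z2 is hypothetical, and the alpha level and per-group sample size are supplied as arguments because they are not among the MODEL options listed here.

```python
from math import sqrt
from statistics import NormalDist


def power_z2(m1, m2, pooled, n_per_group, alpha=0.05, alter="NE"):
    """Power of a two-sample z test with equal group sizes (textbook formula)."""
    normal = NormalDist()
    # Mean difference in standard-error units; (m1 - m2) / pooled is the
    # signed counterpart of STEFF.
    delta = (m1 - m2) / (pooled * sqrt(2.0 / n_per_group))
    if alter == "NE":                       # two-sided: m1 != m2
        z = normal.inv_cdf(1 - alpha / 2)
        return normal.cdf(delta - z) + normal.cdf(-delta - z)
    z = normal.inv_cdf(1 - alpha)
    if alter == "GT":                       # one-sided: m1 > m2
        return normal.cdf(delta - z)
    if alter == "LT":                       # one-sided: m1 < m2
        return normal.cdf(-delta - z)
    raise ValueError("alter must be NE, LT, or GT")


power_two_sided = power_z2(0.5, 0.0, 1.0, 64)
power_no_effect = power_z2(100.0, 100.0, 25.0, 50)
```

With no true difference the power equals alpha, and a standardized difference of 0.5 with 64 cases per group gives a two-sided power of roughly 0.8.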


MODEL T1 


This model tests whether or not a mean equals a known constant, assuming the 
population standard deviation is unknown. Designate the parameters of the test using 
the following options: 


/ M1 = μ1      mean according to the alternative hypothesis.
  NULL = μ0    mean according to the null hypothesis.
  SD = σ       hypothesized population standard deviation.
  ALTER = NE   alternative hypothesis μ1 ≠ μ0, μ1 < μ0, or μ1 > μ0.
          LT
          GT


MODEL PAIRED 


The paired t test addresses the equality of means for two related groups. The following 
options are available: 


/ DIFF = d     hypothesized mean difference.
  WITHIN = σ   standard deviation of the differences between the paired responses.
  ALTER = NE   alternative hypothesis d ≠ 0, d < 0, or d > 0.
          LT
          GT


MODEL T2 


This model tests for the equality of means for two independent groups, assuming the 
two population standard deviations are unknown but equal to each other. Designate the 
parameters of the test using the following options: 


/ WITHIN = σ   the hypothesized pooled standard deviation across the two groups.


Specify the hypothesized effects using one of the following methods: 


  M1 = μ1      the two hypothesized population means.
  M2 = μ2
  RANGE = d    hypothesized mean difference between the two groups.
  STEFF = c    standardized mean difference, equal to the absolute mean difference
               divided by the hypothesized pooled standard deviation. Using STEFF in
               place of the means or mean difference eliminates the need to specify
               the pooled standard deviation.
  ALTER = NE   alternative hypothesis μ1 ≠ μ2, μ1 < μ2, or μ1 > μ2.
          LT
          GT


MODEL ONEWAY 


One-way analysis of variance tests for the equality of means for two or more 
independent groups, assuming the population standard deviations for the groups are all 
equal. The following options are available: 


/ GROUPS = r             number of levels for the independent variable.
  WITHIN = σ             within-cell standard deviation.


Specify the effects using one of the following three options: 


  EFFECT = μ1, μ2, ...   hypothesized effect for each group. The software centers the
                         values about zero by subtracting their mean.
  RANGE = d              difference between the largest effect and the smallest
                         effect. The software creates r uniformly spaced effects
                         covering this range.
  AVGESQ = a             average squared effect divided by the variance of a cell.
                         Using AVGESQ in place of EFFECT or RANGE eliminates the need
                         to specify the within-cell standard deviation.
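
The three ways of specifying effects can be related to one another with a short Python sketch. It follows the descriptions above (centering about zero, uniformly spaced effects, average squared effect over the cell variance); the function names are hypothetical, and the sketch does not reproduce SYSTAT's power computation itself.

```python
def center(effects):
    """Center effects about zero by subtracting their mean (as for EFFECT)."""
    mean = sum(effects) / len(effects)
    return [e - mean for e in effects]


def effects_from_range(d, r):
    """Create r uniformly spaced, centered effects spanning a range of d."""
    if r < 2:
        raise ValueError("need at least two groups")
    step = d / (r - 1)
    return center([i * step for i in range(r)])


def average_squared_effect(effects, within):
    """Average squared centered effect divided by the cell variance."""
    centered = center(effects)
    return sum(e * e for e in centered) / len(centered) / within ** 2


centered = center([10, 15, 20])              # EFFECT=10,15,20
spaced = effects_from_range(10, 3)           # RANGE=10 with GROUPS=3
a = average_squared_effect([10, 15, 20], within=10)
```

Under these definitions, EFFECT=10,15,20 and RANGE=10 with three groups describe the same centered effects, and with WITHIN=10 both correspond to an average squared effect of about 0.167.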


MODEL TWOWAY 


Two-way analysis of variance focuses on the effects of two categorical factors on a 
continuous response. The test assumes that the population standard deviations for the 
groups defined by crossing the factors are equal. The following options are available: 


/ ROWS = r               number of levels for the row factor.
  COLUMNS = c            number of levels for the column factor.
  WITHIN = σ             within-cell standard deviation.
  REFFECTS               identifies the set of effects used in the calculations.
  CEFFECTS               REFFECTS and CEFFECTS correspond to the row and column main
  IEFFECTS               effects, respectively. To compute estimates based on
                         interaction effects, use IEFFECTS. The default is REFFECTS.


Specify the hypothesized effects using one of the following methods: 


  EFFECT = μ1, μ2, ...   hypothesized effect for each group. The software centers the
                         values about zero by subtracting their mean. When using row
                         effects (REFFECTS), enter r values. For CEFFECTS, enter c
                         values. For IEFFECTS, enter r×c values.
  RANGE = d              difference between the largest effect and the smallest
                         effect. The software creates r, c, or r×c uniformly spaced
                         effects covering this range.
  AVGESQ = a             average squared effect divided by the variance of a cell.
                         Using AVGESQ in place of EFFECT or RANGE eliminates the need
                         to specify the within-cell standard deviation.


MODEL GENERIC 


Use this model for any experimental design in which the power depends on a
noncentral F-distribution. Three conditions must be satisfied:

■ The numerator degrees of freedom must be fixed.

■ A linear relationship exists between the denominator degrees of freedom and the
  sample size.

      df₂ = C1*n − C0

■ A linear relationship having an additive constant of zero exists between the
  noncentrality parameter and the sample size.

      λ = δ*n


The following options are available:


/ NDF = df     degrees of freedom for the numerator of the F-ratio.
C1 = m         the slope (C1) and the intercept (C0) for the linear regression of
C0 = b         the denominator degrees of freedom on the number of cases in each
               cell.
NCP = δ        noncentrality factor. The noncentrality parameter for the F
               distribution equals δ×n.
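
The two linear relationships can be evaluated directly for any candidate sample size. The Python sketch below assumes the denominator degrees of freedom are C1*n − C0, with C0 entered as a positive number, as the formula above and the MODEL GENERIC example suggest. The function name is hypothetical and no power value is computed.

```python
def generic_design(n, ndf, c1, c0, ncp):
    """Return (numerator df, denominator df, noncentrality) for n cases per cell."""
    df2 = c1 * n - c0      # denominator df, linear in n
    lam = ncp * n          # noncentrality parameter, no additive constant
    if df2 <= 0:
        raise ValueError("sample size too small: denominator df must be positive")
    return ndf, df2, lam


# NDF=3 C1=4 C0=4 NCP=1.1, i.e. a four-group design with n cases per group.
for n in (5, 10, 20):
    print(n, generic_design(n, ndf=3, c1=4, c0=4, ncp=1.1))
```

For a one-way design with four groups, C1=4 and C0=4 reproduce the familiar error degrees of freedom 4n − 4.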


Examples: 
MODEL T2 / M1=100 M2=110 WITHIN=25
MODEL ONEWAY / GROUPS=3 EFFECT=10,15,20 WITHIN=10
MODEL TWOWAY / ROWS=2 COLUMNS=3 IEFFECTS AVGESQ=0.5
MODEL GENERIC / NDF=3 C1=4 C0=4 NCP=1.1


SAVE command 


SAVE filename 


Saves sample size estimates with corresponding power values to a file. 


ESTIMATE command


ESTIMATE


Initiates the power or sample size calculations.


/ ALPHA = α         specifies the probability of a Type I error. The default is
                    0.05.
POWER = p           specifies the probability of correctly rejecting the null
                    hypothesis. The software determines the smallest sample size
                    needed to meet or exceed p. The default value equals 0.80.
LOW = min           calculates power over a range of sample sizes. LOW defines
HIGH = max          the size of the smallest sample, HIGH defines the size of the
INCREMENT = step    largest sample, and INCREMENT equals the size difference
                    between any two consecutive samples. To return the power for
                    a single sample size, specify a value for LOW only.


Example: 
ESTIMATE / ALPHA=0.01 LOW=50 HIGH=150 INCREMENT=5


Returns the power for twenty-one sample sizes (50, 55, 60, ... , 150) using an alpha 
level of 0.01. 
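
The grid of sample sizes implied by LOW, HIGH, and INCREMENT can be enumerated as in the Python sketch below. The power function is a textbook normal approximation for a two-sided, two-sample comparison with n cases per group. It is included only to show how power grows along the grid and is an assumption, not the calculation SYSTAT performs. All names are hypothetical.

```python
from statistics import NormalDist


def sample_sizes(low, high=None, increment=1):
    """Sample sizes from LOW to HIGH in steps of INCREMENT (LOW alone gives one size)."""
    if high is None:
        return [low]
    return list(range(low, high + 1, increment))


def approx_power_two_sample(steff, n, alpha=0.05):
    """Normal approximation to two-sided power for standardized effect steff."""
    std = NormalDist()
    crit = std.inv_cdf(1 - alpha / 2)
    shift = steff * (n / 2) ** 0.5
    return std.cdf(shift - crit) + std.cdf(-shift - crit)


sizes = sample_sizes(50, 150, 5)
print(len(sizes), sizes[0], sizes[-1])  # 21 50 150
for n in sizes[:3]:
    print(n, round(approx_power_two_sample(0.4, n, alpha=0.01), 3))
```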
PROBIT:
Probit Analysis 


The PROBIT module calculates maximum-likelihood estimates of the parameters of the 
probit general linear model. It provides an appropriate method for estimating a 
multiple regression or analysis of variance or covariance when the dependent variable 
is categorical and can take only one of two values. 


Setup: 


* PROBIT
* USE
* MODEL
CATEGORY
SAVE
HOT * ESTIMATE


MODEL command 


MODEL depvar = CONSTANT + indvarexpr 


Specifies a model to be estimated. 


CATEGORY command 


CATEGORY grpvarlist 


Specifies numeric or string grouping variables that define cells. Specify all
categorical variables for which PROBIT should generate design variables.


/ MISS      allows cases with a missing value for the categorical variable to be
            included in the analysis.
EFFECT      produces parameter estimates that are differences from group means.
DUMMY       produces dummy codes for the design variables instead of effect codes.


ESTIMATE command


ESTIMATE 


Causes the previously specified model to be estimated. 


SAVE command 


SAVE filename 


Saves the variable’s z score, the predicted z score from the last model estimated, and
MILLS, the hazard function evaluated at the predicted z score. The z score can be
converted into a predicted probability by using the cumulative normal probability
function in the DATA module. The MILLS variable is often used as a selectivity bias
correction variable in regression models with nonrandom sampling. In addition, the
standard errors, PROB (corresponding probability), DENSITY (associated density
value), and confidence intervals for the parameters are saved.
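
The conversion from a saved z score to a probability, and the hazard value stored as MILLS, can be illustrated in Python. The sketch assumes the hazard function of the standard normal distribution, density divided by upper-tail probability. Whether SYSTAT uses exactly this convention is an assumption, and the function names are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def predicted_probability(z):
    """Cumulative normal probability of a predicted z score."""
    return _STD.cdf(z)


def normal_hazard(z):
    """Standard normal hazard: density divided by the upper-tail probability."""
    tail = 1.0 - _STD.cdf(z)
    if tail <= 0.0:
        raise ValueError("z is too large for a stable hazard value")
    return _STD.pdf(z) / tail


for z in (-1.0, 0.0, 1.0):
    print(z, round(predicted_probability(z), 4), round(normal_hazard(z), 4))
```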
QC:
Quality Control charts


QC provides 11 types of Shewhart control charts (X-bar, variance, s, R, X-bar and s,
X-bar and R, X, np, p, c, or u) as well as Histogram, Pareto chart, Box-and-Whisker
plot, Cumulative sum (CUSUM) chart, Moving average chart, Exponentially weighted
moving average (EWMA) chart, Individual and moving range (XMR) chart, Regression
chart, and Hotelling's T² (TSQ) chart. SYSTAT uses the statistical distribution function
that is appropriate for each chart when computing the control limits; however, the
traditional "sigma limits" approach is also available as an option. SYSTAT also
provides operating characteristic (OC) and average run-length (ARL) curves for eight
statistical distributions. QC also provides process capability indices and process
performance indices to assess the uniformity of a process for the normal distribution
as well as the following non-normal distributions: Beta, Exponential, Gamma, Inverse
Gaussian, Lognormal, Rayleigh, and Weibull.


Setup:


* QC
* USE
* SAVE
PLENGTH
HOT * HIST or PARETO or BOX or RUNCHART or SHEWHART or ARL or OC or MA
or EWMA or XMR or QCREGRESS or CUSUM or TSQ or PCA


SAVE command 


SAVE filename 


Saves the table provided in the LONG output to a file. 


PLENGTH command


PLENGTH SHORT
        MEDIUM
        LONG


For control charts, SHORT output provides the graph and basic statistics; LONG
output additionally provides the chart data. For Process Capability Analysis, SHORT
output provides the chart and capability indices; LONG output adds the observed
performance, expected capability, and expected performance of the process.


HIST: Histogram 


HIST xvar 


Displays the density of a quantitative variable. The options and features to produce 
these displays are described below. 


/ BARS=n number of bars to use in a histogram. 
BWIDTH=n width of bars in a histogram. 
CUM use with HIST to display cumulative densities. 


PARETO: Pareto chart for frequencies 


PARETO yvar*xvar 


Produces a chart showing frequencies of occurrence sorted in descending order, plotted 
as a function of an x variable that identifies each symbol. The yvar must be 0's and 1's 
if individual instances of an event are in the input file. If the yvar contains the numbers 
of instances of an event already aggregated by sample in the input file, then the 
program must be informed by using the AGG option. 


/ CUM     by default, the PARETO chart shows frequencies by sample (by xvar
          identifier); if you specify CUM, it shows cumulative frequencies.
P         produces a chart showing proportions (relative frequencies). You can
          use CUM and P together to get cumulative proportions.
AGG       indicates the data are aggregated.
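
The quantities a Pareto chart displays (sorted frequencies, cumulative frequencies, and the corresponding proportions) can be tabulated with a few lines of Python. This is a sketch of the arithmetic only, assuming unaggregated 0/1 input. The function name and sample data are hypothetical.

```python
from collections import Counter


def pareto_table(identifiers, flags):
    """Rows of (id, count, cumulative count, proportion, cumulative proportion)."""
    counts = Counter()
    for ident, flag in zip(identifiers, flags):
        counts[ident] += flag          # flags are 0/1 indicators of an event
    # Sort by descending frequency; ties are broken by identifier.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    total = sum(count for _, count in ordered)
    rows, running = [], 0
    for ident, count in ordered:
        running += count
        share = count / total if total else 0.0
        cum_share = running / total if total else 0.0
        rows.append((ident, count, running, share, cum_share))
    return rows


for row in pareto_table(["A", "B", "A", "C", "B", "A"], [1, 1, 1, 0, 1, 1]):
    print(row)
```

The count column corresponds to the default display, the running total to CUM, and the two share columns to P and to CUM combined with P.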


BOX: Box-and-whisker plots


BOX yvar*xvar 


Produces a chart showing a series of box and whisker plots for a yvar as a function of 
an xvar that identifies samples. Note that SAVE is not relevant for box and whisker 
plots. 


The following options are available: 


/ CENTER=n                value for the center line of the chart.
LIMITS=n1,n2,...,n12      a priori control limits for the chart. Up to 12 values
                          of limits can be given.


RUNCHART: Run Chart


RUNCHART yvar 


Produces a Run chart that shows the value of quality characteristics (yvar). 


The following options are available: 


/ GROUP = groupvar   grouping variable used to group the X values.
TREND                checks for a trend in the chart.
SHIFT                checks for a possible shift in the process from the center.
PATTERN              checks whether a definite pattern exists in the chart.


SHEWHART yvar*xvar / TYPE=type


Produces a Shewhart chart for a yvar as a function of the xvar, using one of these types:


XBAR      Mean (default)                  X     Individual cases
VAR       Variance                        NP    Binomial count
S         Standard deviation              P     Binomial proportion
R         Range                           C     Poisson count
XBAR S    Means and standard deviations   U     Poisson rate
XBAR R    Means and ranges


The following options are available:


/ PLIMITS=p1,p2,...,p12   proportion of the sample statistic values expected to
                          be below each control limit. The default values are
                          0.99865 and 0.00135 for upper and lower limits,
                          respectively. You can specify up to 12 limits.
LIMITS=n1,n2,...,n12      a priori values for control limits with up to 12 values.
SLIMITS=n1,n2,...,n12     specify up to 12 values. SLIMITS values are assumed to
                          be in units of the standard error of the statistic being
                          plotted.
CENTER=n                  a priori value to use for the center line of the chart.
                          By default, these values are computed from the data.
SIGMA=n                   a priori value for the population within-sample standard
                          deviation. By default, these values are computed from
                          the data.
AVGN                      causes control limits to be computed using average
                          subgroup size for each subgroup.
AGG                       indicates that input data are already aggregated by
AGG=TOTAL                 subgroup. (Use the FREQUENCY command to indicate
                          subgroup sizes if you use AGG.) AGG=TOTAL is for P or U
                          charts only, meaning that the aggregate is the total
                          count rather than the proportion.
YMIN=n, YMAX=n            minimum and maximum scale values for the Y axis.
Z                         prints chart data in standard deviation units with
                          CENTER=0, SIGMA=1.
TEST=0 12345678           performs the specified run tests. All 8 tests are
                          performed by default. To perform no test, select
                          TEST=0.
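
The default PLIMITS values correspond to the traditional three-sigma limits when the plotted statistic is normally distributed. The Python sketch below shows that correspondence for an X-bar chart. It assumes a normal sampling distribution with standard error SIGMA divided by the square root of the subgroup size; the function names are hypothetical.

```python
from statistics import NormalDist


def plimits_to_sigma_multiples(plimits):
    """Convert probability limits to multiples of the standard error."""
    std = NormalDist()
    return [std.inv_cdf(p) for p in plimits]


def xbar_control_limits(center, sigma, n, plimits=(0.99865, 0.00135)):
    """Control limits for subgroup means of size n, in the order of plimits."""
    standard_error = sigma / n ** 0.5
    return [center + z * standard_error
            for z in plimits_to_sigma_multiples(plimits)]


print([round(z, 3) for z in plimits_to_sigma_multiples((0.99865, 0.00135))])
print([round(v, 3) for v in xbar_control_limits(center=10, sigma=2, n=4)])
```

Under these assumptions, PLIMITS=0.99865,0.00135 and SLIMITS=3,-3 would describe essentially the same pair of limits.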




ARL: Average run length curve 


ARL / TYPE=type N=n 


Produces an Average Run Length curve of the type specified, where type can be: 


XBAR    Mean (default)         S    Standard deviation
NP      Binomial count         C    Poisson mean
VAR     Variance               R    Range
P       Binomial proportion    U    Poisson rate


You must specify N = n, where n is the sample size.


The following options are available:


/ N=n              sample size. The default is 1 for all but TYPE=VAR, S, or R,
                   for which the default is 2.
PLIMITS=p1,p2      proportion of the sample statistic values expected to be below
                   the control limits. The default values are 0.99865 and 0.00135
                   for upper and lower limits, respectively. No more than two
                   limits are accepted for PLIMITS.
CENTER=n           null hypothesis value for the sampling distribution. CENTER is
                   the expected value of the sampling distribution when the null
                   hypothesis is true. The default is 0 for TYPE=XBAR.
SIGMA=n            a priori value for the population within-sample standard
                   deviation. The default value is 1 for TYPE=XBAR, VAR, S, or R.
YMIN=n, YMAX=n,    lower and upper limits for the X and Y axes.
XMIN=n, XMAX=n


OC: Operating characteristic curve 


OC / TYPE=type N=n 


Produces an Operating Characteristic curve showing the probability of a type II error
(beta) as a function of a range of possible expected values for a stated parameter type, 
where type can be: 


XBAR Mean (default) NP Binomial count 
VAR Variance P Binomial proportion 
S Standard deviation C Poisson mean 

R Range U Poisson rate 


You must specify N = n, where n is the sample size.


The following options are available:


/ N=n              sample size. The default is 1 for all but TYPE=VAR, S, or R,
                   for which the default is 2.
PLIMITS=p1,p2      proportion of the sample statistic values expected to be below
                   the control limits. The default values are 0.99865 and 0.00135
                   for upper and lower limits, respectively. No more than two
                   limits are accepted for PLIMITS.
CENTER=n           null hypothesis value for the sampling distribution. CENTER is
                   the expected value of the sampling distribution when the null
                   hypothesis is true. The default is 0 for TYPE=XBAR.
SIGMA=n            population standard deviation. The values specified for SIGMA,
                   N, and PLIMITS are used to determine control limit values.
                   These, in turn, determine the value of BETA for a given
                   expected value. The default is SIGMA=1 for TYPE=XBAR, VAR,
                   S, or R.
YMIN=n, YMAX=n,    lower and upper limits for the X and Y axes.
XMIN=n, XMAX=n
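
For TYPE=XBAR, the probability of a type II error and the average run length can be sketched in Python as follows. The code assumes a normally distributed sample mean and the usual relationship ARL = 1 / (1 − beta). It illustrates how SIGMA, N, PLIMITS, and CENTER interact and is not a description of SYSTAT's internal computation. The function names are hypothetical.

```python
from statistics import NormalDist


def beta_xbar(true_mean, center=0.0, sigma=1.0, n=1,
              plimits=(0.99865, 0.00135)):
    """Probability that a sample mean falls inside the control limits."""
    std = NormalDist()
    standard_error = sigma / n ** 0.5
    upper = center + std.inv_cdf(max(plimits)) * standard_error
    lower = center + std.inv_cdf(min(plimits)) * standard_error
    sampling = NormalDist(true_mean, standard_error)
    return sampling.cdf(upper) - sampling.cdf(lower)


def average_run_length(beta):
    """Expected number of samples until one falls outside the limits."""
    return 1.0 / (1.0 - beta)


for shift in (0.0, 1.0, 2.0, 3.0):
    b = beta_xbar(shift)
    print(shift, round(b, 4), round(average_run_length(b), 1))
```

When the true mean equals CENTER, beta is simply the difference between the two default PLIMITS values, 0.9973.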


CUSUM: Upper and lower cumulative sum chart 


CUSUM yvar*xvar 


Produces upper and lower Cumulative Sum charts for a yvar as a function of the xvar, 
which identifies samples. 


The following options are available: 


/ K=n         reference value n used to compute the cumulative sums. The default
              is 0.5.
H=n           control limit for the cumulative sum. The default is 5.
START=n       start for initial sum. The default is 0.
UPPER         by default, CUSUM plots both upper and lower Cumulative Sum charts.
LOWER         Select one of these to plot just one.
ZCL=n         when you set PLENGTH LONG, the ZCL option flags the tabular output
              for cases whose absolute z values exceed the absolute value of the
              n you specify.
CENTER=n      a priori center value.
SIGMA=n       a priori value for the population standard deviation.
AGG           indicates that input data are already aggregated by sample. Use the
              FREQUENCY command to indicate the sample size.
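
The roles of K, H, and START can be seen in the standard tabular CUSUM recursion, sketched below in Python. This is the textbook scheme applied to standardized values. It is an assumption that SYSTAT's CUSUM follows the same recursion, and the function name is hypothetical.

```python
def tabular_cusum(values, center, sigma, k=0.5, h=5.0, start=0.0):
    """Return rows of (z, upper sum, lower sum, out-of-control flag)."""
    upper = lower = start
    rows = []
    for x in values:
        z = (x - center) / sigma
        upper = max(0.0, upper + z - k)   # accumulates upward drift beyond k
        lower = max(0.0, lower - z - k)   # accumulates downward drift beyond k
        rows.append((z, upper, lower, upper > h or lower > h))
    return rows


for row in tabular_cusum([0, 1, 2, 3, 4], center=0, sigma=1):
    print(row)
```

In this example the upper sum first exceeds H=5 at the fifth value, where it reaches 8.0.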


MA: Unweighted moving average charts


MA yvar*xvar 


Produces a plot that shows the unweighted Moving Average of a yvar, a center line, and 
upper and lower control limits as a function of an xvar identifying each sample. 


The following options are available: 


/ WIDTH=n                 number of samples over which the moving average is to
                          be computed. The data for a given sample are averaged
                          with the data from the previous n-1 samples to obtain
                          the moving average for the given sample. The number of
                          cases used to compute limits for a sample is the number
                          of cases in the sample plus the number of cases in the
                          previous n-1 samples. The default is 1, which produces
                          a simple Shewhart X chart.
PLIMITS=p1,p2,...,p12     proportion of the sample statistic values expected to
                          be below the control limits. The default values are
                          0.99865 and 0.00135 for upper and lower limits,
                          respectively. You can specify up to 12 PLIMITS.
LIMITS=n1,n2,...,n12      values for the lower and upper control limits,
                          overriding any values of SIGMA for computing limits.
SLIMITS=n1,n2,...,n12     control limits in units of the standard error of the
                          statistic being plotted.
CENTER=n                  value for the center line. By default, these values
                          are computed from the data.
SIGMA=n                   SIGMA value. By default, this value is computed from
                          the data.
AGG                       indicates the data are already aggregated by sample.
                          You must use the FREQUENCY command to indicate sample
                          sizes if you use AGG.
AVGN                      if sample sizes for each sample are not the same,
                          default control limits fluctuate from sample to sample
                          on the chart, because the value of the limits depends
                          on sample size. AVGN causes the functional sample size
                          for each sample to be the average of all of the
                          individual sample sizes.
YMAX=n, YMIN=n            rescales the Y axis of the chart.
TEST=0 12345678           performs the specified run tests. All 8 tests are
                          performed by default. To perform no test, select
                          TEST=0.
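
The effect of WIDTH can be illustrated with a short Python sketch that averages each sample mean with the means of the previous n-1 samples. It assumes equal sample sizes, so that averaging sample means is equivalent to pooling the cases. The function name is hypothetical.

```python
def moving_average(sample_means, width=1):
    """Unweighted moving average over the current and previous width-1 samples."""
    if width < 1:
        raise ValueError("width must be at least 1")
    averages = []
    for i in range(len(sample_means)):
        window = sample_means[max(0, i - width + 1): i + 1]
        averages.append(sum(window) / len(window))
    return averages


print(moving_average([2, 4, 6, 8], width=2))  # [2.0, 3.0, 5.0, 7.0]
print(moving_average([2, 4, 6, 8], width=1))  # [2.0, 4.0, 6.0, 8.0]
```

With the default width of 1 the series is returned unchanged, which matches the statement that the default reduces to a simple Shewhart chart.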


EWMA: Exponentially weighted moving average charts


EWMA yvar*xvar 


Creates a plot that shows the Exponentially Weighted Moving Average of a yvar,
together with a center line and upper and lower control limits, as a function of an
xvar that identifies each sample.


The following options are available: 


/ K=n                     weight constant, which must be between 0 and 1, for
                          computing the exponentially weighted moving average.
                          The default is 1.
CENTER=n                  value for the center line. By default, CENTER is
                          computed from the data.
LIMITS=n1,n2,...,n12      values for the lower and upper control limits,
                          overriding any values of SIGMA and PLIMITS for
                          computing limits. By default, LIMITS are computed from
                          the data. Up to 12 values can be specified for LIMITS.
PLIMITS=p1,p2,...,p12     proportion of the sample statistic values expected to
                          be below the control limits. The default values are
                          0.99865 and 0.00135 for upper and lower limits,
                          respectively.
SIGMA=n                   SIGMA value. By default, SIGMA is computed from the
                          data.
AGG                       indicates that input data are already aggregated by
                          subgroup. (Use the FREQUENCY command to indicate
                          subgroup sizes if you use AGG.)
AVGN                      causes control limits to be computed using average
                          subgroup size for each subgroup.
YMAX=n, YMIN=n            rescales the Y axis of the chart.
TEST=0 12345678           performs the specified run tests. All 8 tests are
                          performed by default. To perform no test, select
                          TEST=0.
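
The weight constant K enters the usual EWMA recursion, in which each plotted value is K times the current observation plus (1 − K) times the previous plotted value. The Python sketch below uses that textbook recursion. The choice of starting value is an assumption, and the function name is hypothetical.

```python
def ewma(values, k=1.0, start=None):
    """Exponentially weighted moving average with weight constant k."""
    if not 0 < k <= 1:
        raise ValueError("k must be greater than 0 and at most 1")
    if not values:
        return []
    # Assumption: start from the first observation unless a value is supplied.
    previous = values[0] if start is None else start
    smoothed = []
    for x in values:
        previous = k * x + (1 - k) * previous
        smoothed.append(previous)
    return smoothed


print(ewma([10, 20, 30], k=0.5))  # [10.0, 15.0, 22.5]
print(ewma([10, 20, 30], k=1.0))  # [10.0, 20.0, 30.0]
```

With the default K=1 no smoothing takes place and the plotted values equal the observations.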


XMR: XMR chart 


XMR yvar 


Produces an X-MR chart showing the X-chart (for the given yvar) and the Moving Range
chart (for a specified width; the default is 2), plotted one after another in the same
graph window.


The following options are available:


/ WIDTH=n             number of samples over which the moving range is
                      calculated. The default value is 2.
PLIMITS=p1,p2         proportion of sample statistic values expected to be below
                      the control limits. You can give 2 PLIMITS signifying the
                      lower and upper probability limits.
SLIMITS=n1,n2         control limits in units of the standard error of the
                      statistic to be plotted. You can give 2 SLIMITS signifying
                      the lower and upper sigma limits.
CENTER=n              value for the center line for X-chart. By default it is
                      computed from the data.
SIGMA=n               sigma value for X-chart. By default it is computed from the
                      data.
MR=n                  value for the center line for MR-chart. By default it is
                      computed from the data.
MEAN                  specifies that the sigma calculations are based on the mean
                      of MR. This is the default, so you need not specify it.
MEDIAN                specifies that the sigma calculations are based on the
                      median of MR. Because this is not the default, you must
                      specify it explicitly.
TEST=0 12345678       performs the specified run tests. All 8 tests are performed
                      by default. To perform no test, select TEST=0.


QCREGRESS: Regression chart


QCREGRESS yvar*xvar 


Creates a regression chart showing a yvar regression line and prediction limits as a
function of xvar.


The following options are available:


/ PLIMITS=p1,p2,...,p12   proportion of the sample statistic values expected to
                          be below the control limits. The default values are
                          0.99865 and 0.00135 for upper and lower control limits,
                          respectively. You can specify up to 12 PLIMITS.
SLIMITS=n1,n2,...,n12     sigma limits can be specified with up to 12 values.
                          SLIMITS values are assumed to be in units of the
                          standard error of the statistic being plotted.
SIGMA=n                   standard error of the estimate for the regression line.
                          It can be specified a priori, but by default, the
                          regression line and SIGMA are calculated from the data.
YMIN=n, YMAX=n,           lower and upper limits for the X and Y axes.
XMIN=n, XMAX=n


TSQ: Hotelling's T² chart for multiple y variables


TSQ yvarlist*xvar


Produces a control chart of Hotelling’s T² for up to 10 yvars as a function of an xvar
that identifies each sample.


The following options are available:


/ COVAR=filename          specifies a file containing an a priori
                          variance-covariance matrix and vector of means. This
                          matrix must be in a SYSTAT file and must be in a
                          specific form. For k yvars, the first k variables in
                          the COVAR file must contain the variance-covariance
                          matrix. This must be followed by one variable
                          containing k means for the yvars. You can have other
                          variables in the file after this. By default, SYSTAT
                          computes the pooled variance-covariance matrix and the
                          grand means of the variables in yvarlist from the data.
PLIMITS=p1,p2,...,p12     proportion of the sample statistic values expected to
                          be below the control limits. The default values are
                          0.99865 and 0.00135 for upper and lower control limits,
                          respectively. You can specify up to 12 PLIMITS.
LIMITS=n1,n2,...,n12      values for the lower and upper control limits,
                          overriding any values of SIGMA and PLIMITS for
                          computing limits. By default, LIMITS are computed from
                          the data. Up to 12 values can be specified for LIMITS.
AVGN                      equal sample sizes are required for this chart. AVGN
                          will cause control limits to be computed by using the
                          average sample size.
AGG                       indicates that input data are already aggregated by
                          sample. Use the FREQUENCY command to indicate the
                          sample size in this case.
YMIN=n, YMAX=n            limits the Y axis to the specified range of values.


PCA: Process capability analysis 


PCA varname / DIST = distname 


Computes various process capability indices and process performance indices,
including some basic statistics. It also provides observed and expected process
performance. The range of varname depends on the choice of the distribution;
distname can be one of the following:


Command Script      Distribution        Data Range
NORMAL              Normal (default)    (−∞, ∞)
BETA                Beta                (0, 1)
EXPONENTIAL         Exponential         (0, ∞)
GAMMA               Gamma               (0, ∞)
INVERSE GAUSSIAN    Inverse Gaussian    (0, ∞)
LOGNORMAL           Lognormal           (0, ∞)
RAYLEIGH            Rayleigh            (0, ∞)
WEIBULL             Weibull             (0, ∞)


The following options are available:


/ USL=n               upper and lower specification limits and nominal value for
LSL=n                 a process. NOMINAL is applicable only when the distribution
NOMINAL               choice is normal.
SIGMATOL = n          specifies sigma tolerance as process spread. The default
                      is 6.
SIZE = n or varname   specifies subgroup size or the variable name identifying
                      the subgroup. It is applicable only when the distribution
                      choice is normal.
BOXCOX                performs the analysis after applying a suitable Box-Cox
                      transformation to the variable used for analysis. This is
                      applicable only when the distribution choice is normal.
                      The data range in this case should be (0, ∞).
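
For the normal case, the two most common capability indices can be written out in Python as below. The formulas are the textbook Cp and Cpk, with the conventional factor of 6 replaced by the SIGMATOL value. Treating SIGMATOL this way, and the function name itself, are assumptions rather than documented SYSTAT behavior.

```python
def capability_indices(mean, sigma, lsl, usl, sigmatol=6.0):
    """Return (Cp, Cpk) for a normally distributed process."""
    if sigma <= 0 or usl <= lsl:
        raise ValueError("need sigma > 0 and usl > lsl")
    spread = sigmatol * sigma                 # total process spread
    cp = (usl - lsl) / spread
    cpk = min(usl - mean, mean - lsl) / (spread / 2)
    return cp, cpk


cp, cpk = capability_indices(mean=10, sigma=1, lsl=7, usl=15)
print(round(cp, 4), round(cpk, 4))  # 1.3333 1.0
```

Cpk is smaller than Cp here because the process mean sits closer to the lower specification limit than to the upper one.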
RAMONA:
Path Analysis or Structural Equation Models


RAMONA implements the McArdle and McDonald (1984) Reticular Action Model
(RAM) for path analysis with manifest and latent variables. Input to the program is 
coded directly from a path diagram without reference to any matrices. 

RAMONA stands for RAM Or Near Approximation. The deviation from RAM is 
minor—no distinction is made between residual variables and other latent variables. 
As in RAM, only two parameter matrices are involved in the model. One represents 
single-headed arrows in the path diagram (path coefficients) and the other represents 
double-headed arrows (covariance relationships). 

RAMONA can correctly fit path analysis models to correlation matrices and avoids
the errors associated with treating a correlation matrix as if it were a covariance matrix
(Cudeck, 1989). Furthermore, you can request that both exogenous and endogenous
latent variables have unit variances. Consequently, estimates of standardized
path coefficients, with the associated standard errors, can be obtained, and difficulties
associated with the interpretation of unstandardized path coefficients (Bollen, 1989,
pp. 123-126, 349-350) can be avoided.


Setup: 
* RAMONA 
* USE 
MANIFEST 
LATENT 
* MODEL 
PLENGTH 
HOT * ESTIMATE


MODEL command


MODEL depvar1 <- explanvar1(i,n1) explanvar2(j,n2),...,e1 <-> e2, e3(k,n3)... 


Specifies the causal model. Covariance (<->) and dependence (<-) relationships can be
specified in any order. Parameter numbers and values, if not the default values, are
specified in parentheses after the variable name. Include a similar statement for each
arrow in your diagram, where depvari is dependent variable i; explanvari, explanatory
variable i; ei, measurement error i; i, j, and k, integer parameter numbers used to fix or
constrain parameters; and ni, starting values.

Example:


MODEL ANOMIA67 <- ALNTN67 (*,*) E1 (0,1.0),
      SES <-> SES (0,1.0)


MANIFEST command 


MANIFEST var1, var2, ...


Specifies the names of manifest variables in your model. Unless MANIFEST is used, 
any SYSTAT file variable names in the MODEL are considered manifest. MANIFEST 
without a variable list clears the current list of manifest variables. 


LATENT command 


LATENT var1, var2, ...


Specifies the names of latent variables in your model. Unless LATENT is used, any
unrecognized variable names in the MODEL are considered latent. LATENT without a
variable list clears the current list of latent variables.


PLENGTH command


PLENGTH SHORT
        MEDIUM
        LONG


To request extended results, enter PLENGTH prior to ESTIMATE. PLENGTH offers three
categories of output:


SHORT     produces the sample covariance (correlation) matrix; a table consisting
          of path coefficient estimates, 90% confidence intervals, standard
          errors, and t statistics (estimate divided by standard error); a table
          consisting of variance and covariance or correlation estimates, 90%
          confidence intervals, standard errors, and t statistics; measures of
          fit of the model.
MEDIUM    produces the panels listed for SHORT plus: details of the iterative
          procedure, the reproduced covariance or correlation matrix, the matrix
          of residuals, information about equality constraints on variances (if
          applicable).
LONG      produces the panels listed for SHORT and MEDIUM plus the asymptotic
          correlation matrix of the estimators.


ESTIMATE command


ESTIMATE


Initiates estimation of the causal model.


/ DISP=COVA       specifies whether a covariance or correlation matrix is
       CORR       analyzed. With COVA, the covariance matrix is analyzed. If the
  or              input matrix is a correlation matrix (has unit diagonal
  TYPE=COVA       elements), the analysis is performed but a warning is printed
       CORR       that some results are incorrect. With CORR, the correlation
                  matrix is analyzed. If a covariance matrix is input, it is
                  converted to a correlation matrix within the program. The
                  default is COVA.
METHOD=MWL        methods of estimation. MWL is maximum Wishart likelihood. GLS
       GLS        is generalized least squares assuming a Wishart distribution
       OLS        for S. OLS is ordinary least squares. With OLS, no measures of
       ADFG       fit and no standard errors of estimators are printed. ADFG is
       ADFU       an asymptotically distribution-free estimate that uses a biased
                  but Gramian (nonnegative definite) estimate of the asymptotic
                  covariance matrix G of the elements of the sample covariance
                  matrix. ADFU is an asymptotically distribution-free estimate
                  that uses an unbiased estimate of G. If using ADFG or ADFU,
                  then a cases-by-variables file must be used. The default is
                  METHOD=MWL.
START=ROUGH       rescales starting values to satisfy the specified variance
      CLOSE       constraints and yields Diag[est. Σ] = Diag[S]. RAMONA applies
                  OLS initially. After partial convergence, RAMONA switches to
                  the procedure specified by METHOD. If you require MWL estimates
                  and supply poor starting values or if you use the * alternative
                  for the starting value, use ROUGH. It is also advisable to use
                  ROUGH with ADFG and ADFU if starting values are poor since the
                  time taken per iteration is less for GLS than for ADFG and
                  ADFU.
NCASES=n          must not be omitted if a correlation or covariance matrix is
                  input or if a NEXT specifier is used. Note that NCASES should
                  exceed the number of manifest variables, p, if METHOD=MWL or
                  GLS and must exceed 0.5p(p+1) if METHOD=ADFG or ADFU.
ITER=n            maximum number of iterations allowed for the iterative
                  procedure. The default is 100.
CONVG=n           tolerance limit for the residual cosine employed by the program
                  as a convergence criterion. The default is 0.0001.
RESTART           saves commands from the current run with estimates of
                  parameters inserted by RAMONA. Use with the BATCH features of
                  OUTPUT.
CONFI=n           specifies a confidence interval range.


Example:


ESTIMATE / DISP=CORR CONVG=0.000001 ITER=500 NCASES=932
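
The NCASES guidance can be checked before running an analysis. The Python sketch below encodes only the two thresholds stated in the option description. The function name is hypothetical, and no threshold is assumed for OLS because none is given.

```python
def ncases_threshold(p, method="MWL"):
    """Value that NCASES should exceed for p manifest variables, or None."""
    method = method.upper()
    if method in ("MWL", "GLS"):
        return p
    if method in ("ADFG", "ADFU"):
        return 0.5 * p * (p + 1)
    if method == "OLS":
        return None  # the text states no requirement for OLS
    raise ValueError(f"unknown estimation method: {method}")


# Hypothetical model with 6 manifest variables and NCASES=932.
for method in ("MWL", "GLS", "ADFG", "ADFU", "OLS"):
    limit = ncases_threshold(6, method)
    ok = limit is None or 932 > limit
    print(method, limit, ok)
```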
RANDSAMP:
Random Sampling


RANDSAMP generates random samples from a specified distribution with specified
parameters. The distribution that is specified may be one of 37 univariate discrete and
continuous distributions.


Setup: 
* RANDSAMP 
SAVE 
HOT * UNIVARIATE 


Note: The GENERATE command used in previous versions is no longer required.


Univariate discrete and univariate continuous random sampling


UNIVARIATE command 


UNIVARIATE distribution notation(parameter list) 


Specifies a distribution with corresponding parameter values as arguments.


/ SIZE=n1       size of the random sample to be generated. The default is 1.
NSAMPLE=n2      number of samples (columns), each of the specified size, to be
                generated. The default is 1.
RSEED=n         random seed.


Examples:


UNIVARIATE NRN(16, 0.5) 


UNIVARIATE ZRN(0,1.2) 
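
The way SIZE, NSAMPLE, and RSEED shape the generated data can be mimicked in Python. The sketch assumes that ZRN(0,1.2) denotes a normal distribution with mean 0 and standard deviation 1.2 and that NRN(16,0.5) denotes a binomial distribution with 16 trials and success probability 0.5. Both readings are assumptions about the distribution notation, and all names in the code are hypothetical.

```python
import random


def draw(dist, params, rng):
    """Draw one value from a named distribution (two illustrative cases only)."""
    if dist == "normal":
        mean, sd = params
        return rng.gauss(mean, sd)
    if dist == "binomial":
        n, p = params
        return sum(rng.random() < p for _ in range(n))
    raise ValueError(f"unsupported distribution: {dist}")


def generate_samples(dist, params, size=1, nsample=1, rseed=None):
    """Return nsample columns, each holding size random values."""
    rng = random.Random(rseed)
    return [[draw(dist, params, rng) for _ in range(size)]
            for _ in range(nsample)]


columns = generate_samples("normal", (0, 1.2), size=5, nsample=2, rseed=123)
print(len(columns), len(columns[0]))  # 2 5
print(generate_samples("binomial", (16, 0.5), size=4, rseed=123))
```

Supplying the same seed reproduces the same columns, which is the usual reason for setting RSEED.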


SAVE command


SAVE filename 


Saves generated samples to a file. 
RDISCRIM:
Robust Discriminant Analysis


RDISCRIM provides robust estimates of location and scatter matrices, discriminant
functions, the classification matrix, and related output. SYSTAT has options to perform
linear or quadratic discriminant analysis. Print options allow you to select the parts of
the output to display, including group frequencies, group means, robust covariance
matrices, and correlations. You can display Mahalanobis distances and robust
distances, and you can save Mahalanobis distances, robust distances, predicted group
membership, and weights.


Setup:
* RDISCRIM
* USE
MODEL
PLENGTH
SAVE
HOT * ESTIMATE


MODEL command


MODEL grpvar = varlist


Specifies the model to estimate.


/ QUADRATIC    quadratic robust discriminant analysis. If omitted, linear robust
               discriminant analysis is performed.


Example: 


MODEL SPECIES$ = PETALWID PETALLEN SEPALWID / QUADRATIC 


PLENGTH command


PLENGTH NONE
        SHORT
        LONG


Controls the amount of output reported and requests specific output panels to print.
Optionally, include one or more panels not provided with your argument specification.


The argument SHORT requests the following four features:


/ MEANS    group frequencies and means.
GCOV       group covariance matrices.
CFUNC      coefficients of the classification functions.
CLASS      classification matrix.


LONG requests the output for SHORT above plus the following features:


/ GCOR     group correlation matrix.
OFREQ      outlier frequencies in each group.
RDIST      robust distances, Mahalanobis distances, weights, predicted group
           membership.


Examples:


PLENGTH MEDIUM
PLENGTH NONE / GCOR CFUNC CLASS OFREQ


SAVE command


SAVE filename 


Saves specified statistics to a file. As in the classical discriminant analysis, you can 
save distances and distances with data. 


/ DATA the data (with transformations created in the current run). 


DISTANCES    for each case, the Mahalanobis distances, robust distances, predicted
             group membership, weights, and misclassification variable.


ESTIMATE command 


ESTIMATE 


Tells SYSTAT to estimate the parameters specified in the model. 
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REGRESS: 
Linear Regression 


REGRESS estimates and tests simple and multiple linear regression models. The model 
for simple linear regression is 


$$y = \beta_0 + \beta_1 x_1 + \varepsilon$$


where $y$ is the dependent variable, $x_1$ is the independent variable, and the $\beta$'s are
the regression parameters (the intercept and the slope of the line of best fit). The model
for multiple linear regression is


$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon$$


Variations of this model include mixture models (constraining the independent 
variables to sum to a constant), polynomial regression, and stepwise regression 
(choosing a subset of predictors from among many candidates). For the latter, you can 
direct SYSTAT interactively at the terminal, specifying which variable to enter or 
remove at each step, or let SYSTAT automatically select variables. For mixture models 
and polynomial regression, see GLM. Stepwise procedures are available in both 
REGRESS and GLM. 

Once the parameters of a model have been estimated, you can use GLM to test that 
a coefficient (or a set of coefficients) is 0 or test that a coefficient equals a hypothesized 
value. 

For each model you fit in REGRESS, SYSTAT reports R², adjusted R², the standard
error of the estimate, and an ANOVA table for assessing the fit of the model. AIC, AIC
(Corrected), and Schwarz's BIC values are also provided for each fitted model. For
more information on AIC and Schwarz’s (1978) BIC in SYSTAT, refer to “Variable
Selection” in the Linear Models chapter of Statistics II. For each variable in the model,
the output includes the estimate of the regression coefficient, the standard error of the
coefficient, the standardized coefficient, tolerance, variance inflation factor (VIF), and
a t statistic for measuring the usefulness of the variable in the model.
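
As an illustration of the quantities named above, the following Python sketch (not SYSTAT code) fits the simple linear regression model by least squares and computes R². The function name and the sample data are hypothetical, chosen only for this example; SYSTAT's own computations are not described in this section.

```python
def simple_linear_regression(x, y):
    """Least-squares fit of y = b0 + b1*x. Returns (b0, b1, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope
    b0 = mean_y - b1 * mean_x      # intercept
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return b0, b1, 1.0 - ss_res / ss_tot


# Hypothetical data lying exactly on the line y = 1 + 2x.
b0, b1, r2 = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1, r2)
```
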


You can predict the dependent variable for a set of new observations of the
independent variables and save results of the analysis (predicted values, residuals, and
diagnostics that identify unusual cases) for further use in examining assumptions. Input
can be the usual cases-by-variables data file or a covariance, correlation, or sum of
squares and cross-products matrix.
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Setup: 


* REGRESS 
* USE 
* MODEL
SAVE 
PLENGTH 
SAMPLE 
HOT * ESTIMATE or START 
STEP 
(series of steps) 
STOP 
HYPOTHESIS 
EFFECT 
ALL 
SPECIFY (contrast language) 
AMATRIX [matrix] 
DMATRIX [matrix] 
HOT TEST 
SAVE 
HOT * PREDICT 


Model building 


MODEL command 


MODEL var = CONSTANT + var1 + var2 + ...


Specifies the linear model to estimate. CONSTANT is an optional parameter (when in 
doubt, include the CONSTANT). 


/ N=n when your data file is a symmetric matrix (for example, a covariance matrix), 
specify the sample size n that generated the matrix. 
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SAVE command 


SAVE filename 


Saves specified statistics to a file, displays the Durbin-Watson statistic and the
first-order autocorrelation, and identifies extreme outliers, if any.


/ COEF       saves estimates of the regression coefficients.
  ADJUSTED   saves the adjusted estimates.
  MODEL      saves the variables from your MODEL statement in addition to the
             statistics given by RESID.
  RESID      saves residuals, predicted values, and diagnostics. For a univariate model
             (one dependent variable), this includes the estimate, residual, leverage,
             Cook's D, Studentized residual, standard error of prediction, and confidence
             and prediction limits for each case. For a multivariate model, SYSTAT saves
             the estimate, residual, and leverage.
  DATA       saves data from the original file.
  PARTIAL    saves partial residuals.
  PREDICT    saves the estimate, the standard error of the predicted values, and the upper
             and lower confidence and prediction limits for the new set of observations.
  NEWDATA    saves the new set of observations used with the PREDICT command.


PLENGTH command 


PLENGTH MEDIUM 
LONG 


To request extended results, enter PLENGTH prior to ESTIMATE. Select one of two
categories of output:


MEDIUM   eigenvalues of X'X, condition indices, variance proportions, and confidence
         intervals for coefficients.
LONG     MEDIUM output plus the correlation matrix of the regression coefficients.
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SAMPLE command 


SAMPLE BOOT(m,n) 
SIMPLE(m,n) 
JACK 


The argument m is the number of samples; the argument n is the size of each sample. 


To get summarized resampling output for the regression coefficients, issue the SAMPLE
command before the ESTIMATE command.


/ CONFI=c   specifies a confidence level for the bootstrap-based confidence interval. The
            default value is 0.95.


Example: 


MODEL PRICE = CONSTANT + DEMAND 
SAMPLE BOOT(1000,25) / CONFI=0.99 
ESTIMATE 


ESTIMATE command 


ESTIMATE 
Tells SYSTAT to estimate the analysis specified in MODEL. 


/ MIX        estimates a mixture model. The independent variables should sum to a constant.
  TOL=n      matrix inversion tolerance limit. The default is 0.001.
  CONFI=n    displays the confidence intervals of the regression coefficients at the
             desired level of confidence. The confidence intervals are displayed only when
             the MEDIUM or LONG option of PLENGTH is active. The default level is 0.95.
  NTEST=KS   checks the assumption of normality using the KS (Kolmogorov-Smirnov,
        SW   Lilliefors), SW (Shapiro-Wilk), or AD (Anderson-Darling) test.
        AD
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Stepwise model building 


START command 


START 


Tells SYSTAT to produce preliminary information for a stepwise model building 
process (use STEP to continue). 


/ FORWARD   initializes REGRESS for forward stepping. SYSTAT reports results for
            Step 0. If omitted, all variables in MODEL that pass the FENTER and
            FREMOVE limits are entered into or removed from the model.

BACKWARD initializes REGRESS for backward stepping. 

TOL=n matrix inversion tolerance limit. The default is 0.001. 

ENTER=p probability of F-to-enter limit. 

REMOVE=p probability of F-to-remove limit. 

FENTER=n F-to-enter limit. Variables with F > n are entered into the model if TOL 
permits. The default is 4.0. 

FREMOVE=n  F-to-remove limit. Variables with F < n are removed. The default is 3.9. 

FORCE=n forces the first n variables in MODEL into the model. The default is 0. 


STEP command 


STEP no argument 
var or index 


Interactively select a variable to enter or remove from the model at each step, or
let SYSTAT automatically select a candidate variable to move. For automatic
stepping, specify no argument (values of FENTER and FREMOVE guide the selection)
and include AUTO in the option list. For interactive stepping, specify no argument or
specify the name (or index number) of a variable to move in (or out) of the model
(irrespective of FENTER and FREMOVE limits). A variable’s index is its order in the
input data file. Repeat STEP for each variable you want to move.


/ AUTO        completes the stepping automatically.
  ENTER=p     probability of F-to-enter limit.
  REMOVE=p    probability of F-to-remove limit.
  FENTER=n    F-to-enter limit. Variables with F > n are entered if TOL permits.
              The default is 4.0.
  FREMOVE=n   F-to-remove limit. Variables with F < n are removed. The default is 3.9.


STOP command


STOP


Stops the stepping and prints the final output. The model is recomputed using all cases 
that have no values missing for the final subset of variables (that is, fewer cases can be 
used at the last step than for this final report). 


Hypothesis testing 


HYPOTHESIS command 


HYPOTHESIS 


Tests hypotheses on a previous MODEL. You must enter HYPOTHESIS before you can 
use any of the following: 


EFFECT AMATRIX 
ALL DMATRIX 
SPECIFY TEST 


For descriptions, see GLM (General Linear Model). 
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PREDICT command 


PREDICT filename 


Based on the last estimated model, SYSTAT predicts the dependent variable for the set 
of new observations of independent variables in filename. 


/ CONFI=n   displays the confidence and prediction intervals of the predicted value
            at the specified confidence level.
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RIDGEREG: 
Ridge Regression 


RIDGEREG performs ridge regression for a dependent variable, one or more
independent variables, and a range of lambda values that you select. The computation
provides the HKB and LW estimates of the optimal value of lambda (according to certain
theories) and a plot (called the ridge trace) from which you can read an appropriate
value of lambda, one at which the regression coefficients begin to stabilize. You can
then use this value of lambda to obtain the corresponding regression equation.

In multiple regression analysis, the nature and significance of the relations between 
the predictor variables and the response variable are of interest. In practice, estimated 
regression coefficients tend to have a large sampling variability when predictor 
variables are highly correlated. This results in regression coefficients not providing 
precise information. The common interpretation of a regression coefficient as 
measuring change in expected value of the response variable when a given predictor 
variable is increased by one unit while all the other predictor variables are held 
constant is not fully applicable when multicollinearity exists. Ridge regression is one 
of the several methods that have been proposed as a remedy for multicollinearity 
problems by modifying the method of least squares to allow shrunken and biased 
estimators of regression coefficients. 
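
The shrinkage described above can be sketched in Python (not SYSTAT code) for the special case of two centered predictors, where the ridge coefficients solve (X'X + lambda*I) b = X'y. The function name, the data, and the lambda grid are hypothetical; the sketch assumes the textbook ridge estimator and does not reproduce the HKB or LW estimates.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Ridge coefficients for two centered predictors.

    Solves the 2x2 system (X'X + lam*I) b = X'y by Cramer's rule.
    """
    a11 = sum(v * v for v in x1) + lam
    a22 = sum(v * v for v in x2) + lam
    a12 = sum(u * v for u, v in zip(x1, x2))
    c1 = sum(u * v for u, v in zip(x1, y))
    c2 = sum(u * v for u, v in zip(x2, y))
    det = a11 * a22 - a12 * a12
    return ((c1 * a22 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det)


# Hypothetical centered data with two highly correlated predictors.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.0, -2.1, 0.2, 1.9, 4.0]

lambdas = [0.0, 0.1, 1.0, 10.0]
trace = [ridge_two_predictors(x1, x2, y, lam) for lam in lambdas]
for lam, (b1, b2) in zip(lambdas, trace):
    print(lam, round(b1, 4), round(b2, 4))
```

Printing the coefficients against lambda gives a numeric version of a ridge trace: the coefficient vector shrinks as lambda grows.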


Setup: 
* RIDGEREG
* USE
* MODEL
SAVE
HOT * ESTIMATE


MODEL command


MODEL depvar = CONSTANT + indvarexpr 


Specifies a model to be estimated. 
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SAVE command 


SAVE filename 


Saves the standardized ridge coefficients corresponding to each of the lambda values
to a file.


ESTIMATE command 


ESTIMATE 
Causes the previously specified model to be estimated. 


/ LMIN=n         minimum value of lambda.
  LMAX=n         maximum value of lambda.
  LSTEP=n        step value of lambda.
  LAMBDA=n1 n2   user-specified lambda values.
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ROBREG: 
Robust Regression 


Robust regression (Rousseeuw and Leroy, 1987) is mainly used for detecting outliers
and for getting stable regression coefficients in the presence of outliers. Outliers are
extreme observations that may be caused by large errors, typographical errors in
recording the data, or other similar reasons. Several methods have been developed
to deal with outliers in data; the ROBREG feature of SYSTAT provides the following
commonly used procedures.


Least Absolute Deviation (LAD) regression, introduced by Boscovich in 1757 (Birkes
and Dodge, 1993), estimates the regression coefficients by minimizing the sum of
absolute residuals, instead of the sum of squares as in ordinary least-squares
regression. In ROBREG, LAD regression uses two estimation methods: IRLS (Iteratively
Reweighted Least-Squares) and SIMPLEX (modified simplex algorithm).


M regression produces a robust fit for a multiple linear regression model by 
minimizing the sum of less rapidly increasing symmetric functions of the residuals. 


Least Median Squares (LMS) regression produces a robust fit by minimizing the 
median of squares of the residuals. 


Least Trimmed Squares (LTS) regression estimates the unknown parameters of a linear
regression model by minimizing the sum of the h smallest squared residuals out of n.


Scale (S) regression produces a robust fit for a multiple linear regression model by
taking into account scale estimates of the residuals.


Rank regression is a non-parametric method based on the idea of using ranks of 
residuals instead of the observations themselves. The regression model uses the 
method of weighted median to calculate nonparametric estimates of the parameters of 
the linear model. 
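
To make the difference between these criteria concrete, the following Python sketch (not SYSTAT code) evaluates the ordinary least-squares, LAD, LMS, and LTS objective functions for one candidate line on data with a single outlier. The function names and data are hypothetical, and the definitions are the textbook ones (LMS as the plain median of squared residuals); ROBREG's internal definitions may differ in detail.

```python
import statistics


def residuals(b0, b1, x, y):
    """Residuals of the candidate line y = b0 + b1*x."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def ols_objective(res):
    return sum(r * r for r in res)               # sum of squared residuals


def lad_objective(res):
    return sum(abs(r) for r in res)              # sum of absolute residuals


def lms_objective(res):
    return statistics.median([r * r for r in res])   # median squared residual


def lts_objective(res, h):
    return sum(sorted(r * r for r in res)[:h])   # sum of h smallest squares


# Hypothetical data: four points on y = 2x and one gross outlier.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 50]
res = residuals(0.0, 2.0, x, y)

print(ols_objective(res), lad_objective(res),
      lms_objective(res), lts_objective(res, h=3))
```

For the line through the four clean points, the outlier dominates the least-squares criterion, contributes less to the LAD criterion, and does not affect the LMS or LTS criteria at all.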
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Setup: 
* ROBREG
* USE
* MODEL
LAD or M or LMS or LTS or S or RRANK
PLENGTH
SAVE
HOT * ESTIMATE


MODEL command


Specifies the linear model to estimate. 


MODEL depvar = CONSTANT + varlist 


CONSTANT is optional 


LAD command 


Tells SYSTAT to fit a multiple linear regression model by minimizing the sum of absolute
residuals.


/ IRLS      uses the iteratively reweighted least-squares method for estimation. This is
            the default option.
  SIMPLEX   uses the modified simplex algorithm for estimation.


M command 


Tells SYSTAT to fit a multiple linear regression model by using a robust function for
downweighting the influence of extreme residuals.
/ POWER=n             sum of the nth power of the absolute values of the residuals. The
                      default value is 1.5.
  TRIM=n              the proportion n of the residuals (those with the largest absolute
                      values) is trimmed. Then, the sum of squares of the remaining
                      residuals is minimized. The default value is 0.10.
  HUBER=p             sum of MAD standardized residuals weighted by Huber (p). The
                      default value is 1.7.
  HAMPEL=p1, p2, p3   sum of MAD standardized residuals weighted by Hampel (p1, p2,
                      p3). The default values are 1.7, 3.4, and 8.5.
  T=df                sum of df/(df + u²), where u = residual/MAD(residuals) and df is
                      the degrees of freedom for the t distribution. The default value
                      is 5.0.
  BISQUARE=n          sum of MAD standardized residuals weighted by BISQUARE (n).
                      The default value is 7.0.
  RAMSAY=n            sum of MAD standardized residuals weighted by RAMSAY (n).
  ANDREWS=n           sum of MAD standardized residuals weighted by ANDREWS (n).
  TUKEY=n             sum of MAD standardized residuals weighted by TUKEY (n).

The parameters for HUBER, HAMPEL, T, BISQUARE, RAMSAY, ANDREWS and TUKEY 
are defined in MAD (Median Absolute Deviations from the median) unit of the 
residuals. 
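
The idea of downweighting in MAD units can be sketched in Python (not SYSTAT code). The sketch assumes the common textbook forms of the Huber and bisquare weight functions; the exact functions SYSTAT uses are not given in this section. The function names and residual values are hypothetical, and the tuning constants reuse the defaults listed above (1.7 and 7.0).

```python
import statistics


def mad(values):
    """Median absolute deviation from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def huber_weight(u, c=1.7):
    """Textbook Huber weight: full weight up to c, then c/|u|."""
    return 1.0 if abs(u) <= c else c / abs(u)


def bisquare_weight(u, c=7.0):
    """Textbook bisquare weight: smooth decay to zero at |u| = c."""
    return (1.0 - (u / c) ** 2) ** 2 if abs(u) < c else 0.0


# Hypothetical residuals with one extreme value.
resid = [-1.0, 0.5, 0.0, 1.0, 12.0]
scale = mad(resid)
u = [r / scale for r in resid]        # residuals in MAD units

for r, ui in zip(resid, u):
    print(r, ui, huber_weight(ui), bisquare_weight(ui))
```

Residuals close to zero keep full weight, while the extreme residual receives a small Huber weight and a bisquare weight of zero.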


LMS command 


Tells SYSTAT to fit a multiple linear regression model by minimizing the median of 
squares of the residuals. 


/ QS        uses the Quick Search method.
  ES        uses the Exhaustive Search method.
  NSAMP=n   specifies the number of sub-samples selected for the quick search
            method.
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LTS command 


Tells SYSTAT to fit a multiple linear regression model based on the subset of size H
whose least-squares fit possesses the smallest sum of squared residuals.


/ H=n1        specifies the size of the subset used to estimate LTS parameters.
  SSUBS=n2    specifies the size of subsets. It is ignored when the data size is small.
  NCSTEP=n3   specifies the number of C-steps used to compute the preliminary estimate
              of the regression coefficients.
  NREP=n4     specifies the number of replications used to select initial p-subsets. It is
              ignored if NREP is greater than n choose p+1.
  NBSOL=n5    specifies the number of best preliminary solutions for which C-steps are
              carried out until convergence.
  INTADJUST   requests intercept adjustment in the preliminary as well as the final LTS
              estimates.


S command 


Tells SYSTAT to fit a multiple linear regression model based on a scale estimator of
residuals.


/ NREP=n4              specifies the number of replications used to select initial subsets.
  BDP=n3               specifies the breakdown point.
  C=n3                 specifies the parameter C in Tukey’s function to control the
                       efficiency of S estimates.
  COV=H1 or H2 or H3   requests the estimation of the H1, H2, or H3-type asymptotic
                       variance-covariance matrix of regression coefficients.


RRANK command 


Tells SYSTAT to fit a multiple linear regression model based on the ranks of residuals.
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PLENGTH command 


PLENGTH MEDIUM 
LONG 


Controls the amount of output reported. To request extended results, enter PLENGTH 
LONG prior to ESTIMATE. 


For LTS, S, LMS, and LAD with SIMPLEX method regression:


MEDIUM   scale estimate, robust R-square, robust estimated coefficients, number of
         outliers, and ordinary least-squares regression for outlier-free data.
LONG     MEDIUM output plus ANOVA in ordinary least-squares regression for
         outlier-free data.


For LAD with IRLS method, M, and RRANK regression:
Output is standard for all PLENGTH options.


SAVE command 


SAVE filename 


Saves residuals, estimated values, and the variables from the model statement in the 
specified file. You can place one SAVE command before ESTIMATE. 


/ RESID saves robust residuals and predicted values. 
COEFF saves robust and least-squares regression coefficients. 
DATA saves data along with residuals and estimated values. 
WEIGHT saves weights used in weighted least-squares regression. 
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ESTIMATE command 


ESTIMATE 


Tells SYSTAT to estimate the analysis specified in MODEL. 


/ CONFI       specifies the confidence coefficient for the confidence intervals for
              regression parameters.
  CUTOFF=k1   specifies the cutoff point to detect outliers.
  TOL=k2      specifies the tolerance for robust regression.
  ITER=n      specifies the maximum number of iterations for the selected robust regression.
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RSM: 
Response Surface Methods 


The RSM module fits the specified response model and finds the optimum factor
setting(s). A block effect may or may not be present in the design. You can fit up to a
second-order model. RSM provides three kinds of optimization techniques: Canonical
Analysis, Ridge Analysis, and Desirability Analysis. Canonical Analysis provides the
nature of the optimal point. This technique is used to optimize response variables one
by one. Ridge Analysis helps to determine the direction in which to look for the optimal
response in the case of a saddle surface or when the stationary point is beyond the
experimental region. Desirability Analysis can be used to optimize with respect to
several responses simultaneously. RSM provides desirability plots as Quick Graphs.


Setup: 
* RSM
* USE
* CUSTOM
* MODEL
SAVE
HOT * ESTIMATE
HOT * CANONICAL
HOT * RIDGE
DESIRABILITY
HOT * OPTIMIZE
HOT * CONTOUR
HOT * SURFACE


CUSTOM command


CUSTOM factor1, factor2... 


Specifies the factor(s) in the model. factor1, factor2, etc. specify independent
variables.


/ BLOCK = variable declares the variable as block 
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Example: 


CUSTOM TEMPERATURE TIME / BLOCK = MACHINE


MODEL command 


MODEL responsevarlist = factor1 + factor2 + factor1*factor2 + factor1*factor1 + ...


Specifies the response surface model. responsevarlist specifies the response variable(s).
factor1, factor2, etc. specify independent variables. Specify an interaction by linking
two variables with an asterisk, and a squared term by linking a variable with itself.


To fit the full second-order model, you need not specify the whole model; just specify
the responsevarlist. By default, SYSTAT fits the full second-order model in the variables
specified in the CUSTOM command.


Examples: 


MODEL YIELD = TEMPERATURE + TIME + TEMPERATURE*TIME + TIME*TIME +
TEMPERATURE*TEMPERATURE


MODEL YIELD VISCOSITY MOLWEIGHT 


SAVE command 


SAVE filename 
Saves the specified statistics/results to a file. 


/ COEF    saves the estimates of the regression coefficients.
  RESID   saves the residuals and the predicted values. This is the default option.
  DATA    saves the residuals along with the data.
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ESTIMATE command 


ESTIMATE 


Estimates the model parameters. It gives the ANOVA, and the 'Lack of Fit' test for each 
response variable separately. 


Example: 
MODEL YIELD VISCOSITY MOLWEIGHT 
ESTIMATE 


CANONICAL command 


CANONICAL responsevarlist 


Performs Canonical Analysis on the specified fitted response variable(s). If you do not 
specify the responsevarlist, Canonical Analysis will be done on all fitted responses. 


Examples: 


CANONICAL YIELD VISCOSITY MOLWEIGHT 
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RIDGE command 


RIDGE responsevarlist 


Performs Ridge Analysis on a specified fitted response variable. 


/ AIM=MAX or MIN   specifies whether to maximize or minimize the response.
  RSTART=n         specifies the point from where the ridge starts. The default is 0.
                   The value should be in [0,5).
  REND=n           specifies the point where the ridge ends. The default is 1. The
                   value should be in (0,5).
  RSTEP=n          specifies the step size along the ridge. The default is 0.1. The
                   value should be less than or equal to (REND-RSTART).


Examples: 


RIDGE YIELD 
RIDGE VISCOSITY / AIM = MIN RSTART = 0 REND = 2 RSTEP = 0.2


DESIRABILITY command 


DESIRABILITY response1 = num


Performs Desirability Analysis on the specified fitted response. num specifies the target
value of the response.


/ AIM=MAX or MIN or   specifies whether you want to maximize the response, to minimize
  TARGET or RANGE     it, to hit the target value, or to keep it in a specified range.
                      The default is TARGET.
  LOWER=l             specifies the lower value of the response.
  UPPER=u             specifies the upper value of the response.
  WEIGHT=n            specifies the weight for each response. Weights can be as large
                      as 10. The default is 1.
  IMPORTANCE=n        specifies the importance for each response. It can be any real
                      number in (0,10]. The default is 1.
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OPTIMIZE command 


Optimizes the response(s) using desirability optimization technique. 
Examples: 
DESIRABILITY YIELD=82 / LOWER=79 AIM=MAX 


DESIRABILITY VISCOSITY=75 / LOWER=70 AIM=TARGET UPPER=90 WEIGHT=1


DESIRABILITY MOLWEIGHT / UPPER=4000 LOWER=2900 AIM=RANGE WEIGHT=2, 
IMPORTANCE=1 


OPTIMIZE 


CONTOUR command 


CONTOUR responsevar * factor1 * factor2 / factor3 = num, ... 


Produces a contour plot for a response against any two factors. You can fix the levels of
other factors. By default, other factor levels are fixed at their mid-values.


SURFACE command 


SURFACE responsevar * factor1 * factor2 / factor3 = num, ...


Produces a surface plot for a response against any two factors. You can fix the levels of
other factors. By default, other factor levels are fixed at their mid-values.


SERIES:
Time Series


SERIES implements a wide variety of time series models, including linear and
nonlinear filtering, Fourier analysis, seasonal decomposition, nonseasonal and
seasonal exponential smoothing, the Box-Jenkins approach to nonseasonal and
seasonal ARIMA, and trend analysis. You can save results from transformations,
smoothing, the deseasonalized series, and forecasts for use in other SYSTAT
procedures.

SERIES provides graphical displays for time series analysis (case plots,
autocorrelations, partial autocorrelations, and cross-correlations), and includes
features for differencing and transformations, smoothing, and seasonal decomposition
and adjustment.


Setup:


* SERIES
* USE
CLEAR
MISSING
SAVE
TIME
HOT TPLOT or ACF or PACF or CCF
HOT DIFFERENCE or LOG or PCNTCHANGE or MEAN or
SQUARE or TREND or INDEX or TAPER
HOT SMOOTH or EXPONENTIAL or ADJSEASON or
ARIMA or MKTEST or STEST or FOURIER


CLEAR command


CLEAR var


Clears var from the active workspace. All transformations are “undone” and the
original series is restored.
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MISSING command 


MISSING INTERPOLATE
        DELETE


INTERPOLATE interpolates missing values using DWLS (Distance Weighted Least 
Squares). DELETE prevents interpolation and only the leading nonmissing values are 
retained for analysis. 


SAVE command 


SAVE filename 


Saves the series resulting from ACF, PACF, CCF, SMOOTH, LOWESS, EXPONENTIAL, 
ADJSEASON, ARIMA, FOURIER or transformations to a file. 


Graphical displays 


TIME command 


TIME n1, n2, n3 


Labels points in a series plot. The year of the first observation in the series is n1. The 
periodicity is n2 and n3 indicates the period of the first observation. 


/ FORMAT='date format'   displays the specified date format for values on the X-axis (time axis).


TPLOT command 


TPLOT var 


Plots data values of var. The variable you select is the dependent (vertical axis) 
variable, and the CASE is the independent variable. The points are connected with a 
line. 
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ACF command 


ACF var 


Graphs the autocorrelation function of var. The first numeric variable is the default. 


/ LAG=n   number of lags to display.


PACF command 


PACF var 
Graphs the partial autocorrelation function of var. The first numeric variable is the 
default. 


/ LAG=n   number of cases lagged before calculating the autocorrelation.


CCF command 


CCF var1, var2 


Graphs the cross-correlation of the two numeric variables var1 and var2. You must specify
both variables.


/ LAG=n   produces [(n/2)-1] positive and negative lags as well as LAG = 0. If n is an even
          number, one more negative lag is calculated. n must be a positive integer.
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Transformations 


DIFFERENCE command 


DIFFERENCE var 
Transforms the data into differences between successive observations. 


/LAG=n the seasonal period for the difference calculations. n must be a positive integer. 
The default value is 1. 


INDEX command 


INDEX var 


Replaces each value of var with the ratio of that value to the base observation. The 
default base is the first observation. 


/ BASE=n   base value. The default value is 1.


LOG command 


LOG var 


Transforms the values of var into their natural logs. 


MEAN command 


MEAN var 


Subtracts the mean of var from each observation. 
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PCNTCHANGE command 


PCNTCHANGE var


Replaces each observation of var with the percentage change from the previous
observation.


SQUARE command 


SQUARE var 


Squares each observation of var. 


TAPER command 


TAPER var 


Tapers var using split-cosine-bell. 


/ P=n   proportion of the series to be within the tapering window. The default value is 0.5.


TREND command 


TREND var 


Removes the linear trend from var. 
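
The arithmetic behind several of these transformations can be sketched in Python (not SYSTAT code). The function names and sample series are hypothetical. The sketch assumes that BASE is a 1-based case number, that percentage change is 100 times the relative change, and that values without a predecessor are simply dropped; SYSTAT's handling of the first cases is not stated in this section.

```python
def difference(series, lag=1):
    """Differences between observations that are lag cases apart."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def index_series(series, base=1):
    """Ratio of each value to the base observation (1-based case number)."""
    base_value = series[base - 1]
    return [v / base_value for v in series]


def pcntchange(series):
    """Percentage change from the previous observation."""
    return [100.0 * (series[t] - series[t - 1]) / series[t - 1]
            for t in range(1, len(series))]


def mean_center(series):
    """Subtracts the series mean from each observation."""
    m = sum(series) / len(series)
    return [v - m for v in series]


# Hypothetical series.
sales = [100, 110, 99, 120]
print(difference(sales))
print(difference(sales, lag=2))
print(index_series(sales))
print(pcntchange(sales))
print(mean_center(sales))
```
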
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Smoothing 


SMOOTH command 


SMOOTH var 
Smooths a series. 


/ LOWESS=n          Cleveland’s smoother with smoothness parameter n, 0 <= n <= 1.
                    Specifies the smoothness of the function. As n increases, the
                    function becomes smoother (tighter). The default value is 0.5.
  MEAN=n            running means, where n is the number of neighbors to include in the
                    smoothed mean at a given point along the abscissa. The default
                    value is 3.
  MEDIAN=n          running medians, where n is the number of neighbors to include in
                    the smoothed median at a given point along the abscissa.
  WT=n1,n2,n3,...   weights to assign to the neighbors of the data point being smoothed.
                    Specify an n for each neighbor included.
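
Running means and running medians can be sketched in Python (not SYSTAT code) as follows. The sketch assumes an odd window of n values centered on each point and drops the end points where the window does not fit; how SYSTAT counts neighbors and treats the ends of the series is not stated in this section. The function name and data are hypothetical.

```python
import statistics


def running(series, n, stat):
    """Applies stat to a centered window of n values (n assumed odd)."""
    half = n // 2
    return [stat(series[t - half:t + half + 1])
            for t in range(half, len(series) - half)]


# Hypothetical series with one spike.
values = [1, 2, 9, 4, 5]
print(running(values, 3, statistics.mean))     # running means
print(running(values, 3, statistics.median))   # running medians
```

The running median is less affected by the spike than the running mean, which is the usual reason for choosing it.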


EXPONENTIAL command 


EXPONENTIAL var 
Specifies an exponential smoothing model. 


/ ADDITIVE=n         additive seasonal term, 0 <= n < 1.
  FORECAST=n or      number of forecasts or a range of forecasts, where n specifies the
           a..b      number of forecasts to follow the last case in the file, or use a
                     and b to specify the range of cases.
  LINEAR=n           linear growth, 0 <= n < 1.
  MULTIPLICATIVE=n   multiplicative seasonal term, 0 <= n < 1.
  PERCENTAGE=n       percentage growth.
  SEASON=n           seasonal periodicity, where n must be a positive integer. The
                     default value is 12.
  SMOOTH=n           weight for level term, 0 <= n < 1. The default value is 0.1.
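
The role of the level weight can be sketched in Python (not SYSTAT code) with simple exponential smoothing, in which each smoothed value is a weighted average of the current observation and the previous smoothed value. The function name and data are hypothetical, and starting the level at the first observation is an assumption; growth and seasonal terms are omitted.

```python
def exponential_smooth(series, weight=0.1):
    """Level-only exponential smoothing; the level starts at the first value."""
    level = series[0]
    smoothed = [level]
    for value in series[1:]:
        level = weight * value + (1.0 - weight) * level
        smoothed.append(level)
    return smoothed


# Hypothetical series with a step change.
smoothed = exponential_smooth([10, 20, 20], weight=0.5)
print(smoothed)
```

A larger weight makes the smoothed series follow the data more closely; a smaller weight gives a smoother, slower response.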
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Seasonal decomposition and adjustment 


ADJSEASON command 


ADJSEASON var 


Computes seasonal decomposition and adjustment on var by the ratio-to-moving-averages
method. The first numeric variable is used by default.


/ ADDITIVE         specifies an additive or multiplicative term.
  MULTIPLICATIVE
  SEASON=n         seasonal periodicity. The default value is 12.
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Box-Jenkins ARIMA model 


ARIMA command 


ARIMA var 


Estimates the Box-Jenkins ARIMA model. The default var is the first numeric variable
in the file.


/ P=n             number of AR parameters. The default value is 1.
  PS=n            number of seasonal AR parameters.
  Q=n             number of MA parameters.
  QS=n            number of seasonal MA parameters.
  SEASON=n        seasonal periodicity. The default value is 1.
  CONSTANT        includes a constant in the ARIMA model. The constant
                  is the sample mean.
  BACKCAST=n      number of backcasts. n must be a positive integer. The
                  default value is 0.
  ITER=n          maximum number of iterations allowed to fit your
                  model. The default value is 20.
  CONV=n          how closely the fitted values must match the actual
                  values before estimates are considered converged,
                  n > 0. The default value is 0.01.
  FORECAST=n or   number of forecasts (n) or the time periods to forecast
    time1..time2  (time1, time2). All must be positive integers.
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Trend Analysis 


MKTEST command 


MKTEST seriesvar*timevar


Computes the Mann-Kendall test for the seriesvar, where the time variable is specified
by timevar.


/ ALT=UPWARD     alternative type of the test. The default option is UPWARD.
      DOWNWARD
      TWOSIDED
  SLOPE          produces Sen's slope estimator.
  CONFI=u        displays the confidence interval for Sen's slope estimator at the desired
                 level of confidence. The default value is 0.95.
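
The two statistics named here can be sketched in Python (not SYSTAT code) from their textbook definitions: the Mann-Kendall S statistic counts later-minus-earlier differences by sign, and Sen's slope is the median of all pairwise slopes. The function names and data are hypothetical, and the sketch omits the variance, p-value, and confidence interval that a full test reports.

```python
import statistics


def mann_kendall_s(series):
    """Number of increasing pairs minus number of decreasing pairs."""
    s = 0
    n = len(series)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    return s


def sens_slope(series, times):
    """Median of the slopes between all pairs of time points."""
    n = len(series)
    slopes = [(series[j] - series[i]) / (times[j] - times[i])
              for i in range(n - 1) for j in range(i + 1, n)
              if times[j] != times[i]]
    return statistics.median(slopes)


# Hypothetical series observed at five equally spaced times.
level = [2, 4, 3, 7, 9]
year = [1, 2, 3, 4, 5]
s_stat = mann_kendall_s(level)
slope = sens_slope(level, year)
print(s_stat, slope)
```

A positive S points to an upward trend and a negative S to a downward trend; the slope estimate gives the typical change per unit of time.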


STEST command 


STEST seriesvar*timevar 


Computes a seasonal trend test for the seriesvar, where the time variable is specified by
timevar.


/ SEASON=season   specifies a seasonal variable.
  TEST=SK         performs the Seasonal Kendall test.
       MSK        performs the Modified Seasonal Kendall test.
       HT         performs the Seasonal Homogeneity test.
  ALT=UPWARD      alternative type of the test. The default option is UPWARD.
      DOWNWARD
      TWOSIDED
  SLOPE           produces the slope estimator for the SK and MSK tests.
  CONFI=u         displays the confidence interval of the Seasonal Kendall and Modified
                  Seasonal Kendall slope estimators at the desired level of confidence.
                  The default value is 0.95.
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Fourier analysis 


FOURIER command 


FOURIER varlist 


Computes Fourier decomposition. 


/LAG=n first n cases. 
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SETCOR: 
Set and Canonical Correlations 


SETCOR computes set correlations and canonical correlations. Set correlation (SC) is 
a generalization of simple and multiple correlation. In its standard form, it generalizes 
bivariate and multiple regression to their multivariate analogue. The standard 
univariate and multivariate methods provided by the SYSTAT GLM module (for 
example, multivariate analysis of variance and covariance, discriminant function 
analysis) may be viewed as special cases of SC. SC thus provides a single general 
framework for the study of association. In contrast to canonical correlation, it yields a
partitioning of variance in terms of the original variables, rather than their canonical
transformations.


Setup: 
* SETCOR
* USE
* MODEL
CATEGORY
ERROR
PLENGTH
HOT * ESTIMATE


MODEL command


MODEL yvarlist = xvarlist
      yvarlist | ypartials = xvarlist | xpartials


Specifies a set correlation model or a canonical correlation model. The simple
canonical correlation model has no partial variable lists.
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CATEGORY command 


CATEGORY grpvarlist 


Specifies numeric or string grouping variables to define cells. All subsequent contrasts 
assume the sorted ordering of the levels. 


/ MISS adds a category for cases with a missing value for the categorical variable in the 
analysis. 


Example: 


CAT SEX$ EDUCATN 


ERROR command 


ERROR varlist 


Specifies a set of variables to be used in computing error terms for statistical tests. 


PLENGTH command 


PLENGTH SHORT 
LONG 


Controls the amount of output reported. To request extended results, enter PLENGTH 
prior to ESTIMATE. Select one of two categories of output: 
SHORT      the type of association, the variables in sets YPARTIAL, XPARTIAL, and
           G (when present), the Rao F (with its df and p-value), and their shrunken
           values, and the following results for the basic y and x variables: the within-
           set correlation matrices for the y and x variables, the rectangular between-
           set correlation matrix, the betas for estimating each y variable from the x
           set (with their standard errors and p-values), a matrix of the intercorrelations
           of the estimated y values whose diagonal is the multiple of each y
           variable with the x set, and the F test and p-value for the latter.

LONG       PLENGTH SHORT plus results for basic y and x, the Stewart and Love
           redundancy index for y given xb, the canonical correlations and their
           Bartlett chi-square tests, and the canonical coefficients, loadings, and
           redundancies for both sets. In addition, the option ROTATE rotates the
           dependent and independent canonical loadings and the canonical
           correlations.


ESTIMATE command


ESTIMATE


Initiates estimation.


/ N=n      required when using a correlation matrix instead of raw data.
ROTATE=n   rotates n factors by varimax rotation.
SIGNAL:
Signal Detection Analysis 


The SIGNAL module provides analyses of data that are appropriate for the theory of 
signal detection. The response data to be analyzed may consist of 2 to 11 response 
categories. Thus, either binary or rating scale data may be analyzed. An iterative 
technique is used in order to produce maximum likelihood estimates of all model 
parameters including the locations of the category boundaries. Graphical displays of 
ROC curves are available in addition to the numerical output. 


Setup: 


* SIGNAL 
USE
* MODEL 
SAVE 
HOT * ESTIMATE 


MODEL command 


MODEL responsevarlist = stimulusvar


Specifies the stimulus and response variables. 


SAVE command 


SAVE filename 
Saves the required statistics to a file. Issue SAVE prior to the ESTIMATE command.


/ ROC saves ROC curve coordinates 
ESTIMATE command


ESTIMATE 


Initiates the computation of the model that you have selected. Typing ESTIMATE again
after an analysis has finished will cause SIGNAL to use the most recent estimates of
parameters as starting values for continuing the analysis, rather than computing new
ones.

/ NORMAL        indicates that the noise and signal plus noise distributions are
                Gaussian.
NPAR            indicates that a nonparametric model is to be used.
LOGISTIC        fits a logistic model.
EXPONENTIAL     fits an exponential model.
CHISQUARE       fits a chi-square model.
POISSON         fits a Poisson model.
GAMMA           fits a gamma model.
ITERATIONS=n    controls the maximum number of iterations that you want to allow the
                program to perform in order to estimate the parameters. The default is
                50.
CONVERGE=d      sets the convergence criterion. This is the largest relative change in
                any coordinate before iterations terminate. The default is 0.001.
MO=d            MO parameter for gamma model.
R=d             gamma parameter (mean is РУМО). The default is 5.
C=d             scaling constant. For the logistic model, the default is 1.814, and for
                the exponential model, the default is 1.
MEAN=d          mean for Poisson model.
DF=d            degrees of freedom for chi-square model. The default is 10.
START           uses the specified values of DF, MEAN, or R as starting values for
                estimation instead of keeping them fixed.
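

As a rough illustration of what the NORMAL model describes, the following Python sketch computes the familiar closed-form indices for binary (yes/no) data under an equal-variance Gaussian assumption. It is not SYSTAT's iterative maximum likelihood procedure; the function names and the example hit and false-alarm rates are hypothetical.

```python
from statistics import NormalDist


def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Closed-form indices for binary data, assuming equal-variance Gaussians."""
    z = NormalDist().inv_cdf              # standard normal quantile function
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa                # separation of noise and signal+noise means
    criterion = -0.5 * (z_hit + z_fa)     # boundary location relative to the midpoint
    return d_prime, criterion


def roc_point(d_prime, boundary):
    """One ROC coordinate (false-alarm rate, hit rate) for a category boundary."""
    noise = NormalDist(0.0, 1.0)
    signal = NormalDist(d_prime, 1.0)
    return 1.0 - noise.cdf(boundary), 1.0 - signal.cdf(boundary)


# Hypothetical rates: 80% hits, 20% false alarms.
d, c = dprime_and_criterion(0.8, 0.2)
fa, hit = roc_point(d, d / 2.0)
```

Sweeping `boundary` over a range of values traces out the kind of ROC curve that SAVE / ROC stores as coordinates.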
SMOOTH:
Smoothing Data 


SMOOTH implements 126 nonparametric smoothers, including local polynomial 
smoothers, kernel regression smoothers, moving average smoothers (running means), 
running median smoothers, robust smoothers such as Cleveland's LOESS, distance- 
weighted-least-squares smoothers (DWLS), linear interpolators, step function 
smoothers, inverse distance smoothing (Shepard's function), and negative exponential 
smoothing (NEXPO). All smoothers are implemented in either two or three 
dimensions. SMOOTH estimates values at either specified gridpoints or at the predictor 
data values, saving the smoothed estimates to SYSTAT files. 

Cleveland's LOESS (1988) smoother replaces his older LOWESS algorithm (1979). 
Consequently, results from the LOWESS smoother in scatterplots, SPLOMs, 
probability plots, and quantile plots differ slightly from those with LOESS in SMOOTH. 
In general, the LOESS output is preferred over LOWESS. 


Setup: 
* SMOOTH 
USE
* MODEL
SAVE
HOT * ESTIMATE

MODEL command 


MODEL depvar = indvar 

or 

MODEL depvar = indvar1 indvar2 

Identifies the model to be fit. For two-dimensional smoothers, specify a dependent
variable and an independent variable. For three-dimensional smoothers, specify a
dependent variable and two independent variables.

SAVE command


SAVE filename


Saves results to a file. If the number of grid points equals zero, the saved file includes 
the dependent and independent variables, the estimates, and the residuals (dependent 
variable minus smooth estimate). Otherwise, the file contains the dependent and 
independent variables, grid coordinates, and smooth estimate. 


ESTIMATE command 


ESTIMATE


Defines the smoother and initiates the estimation process.


/ WINDOW = FIXED     the region containing the points to include in a smooth
           KNN       estimate. FIXED uses a global fixed-width window around each
                     point (default). KNN selects a set number of points that are
                     closest to each point.

KERNEL =             the function used for weighting cases. Select one of seven
                     kernel functions.

  EPANECHNIKOV       Epanechnikov kernel (default). Weights decrease as distance
                     from current point increases.

  BIWEIGHT           biweight kernel. More peaked than the Epanechnikov kernel,
                     this function allows higher weighting for points which are
                     moderately close to the current point.

  TRICUBE            tricube kernel. More peaked than the biweight kernel, this
                     function allows higher weighting for points which are near
                     the current point.

  TRIWEIGHT          triweight kernel. The most peaked kernel function, assigning
                     very high weights to points very near the current point.
                     Extreme cases receive very little weight.

  GAUSSIAN           Gaussian kernel. Assigns weights according to a normal
                     distribution.

  CAUCHY             Cauchy kernel. Assigns weights according to a Cauchy
                     distribution.

  UNIFORM            uniform, boxcar, or square window kernel. All cases receive
                     identical weights.

SMOOTHER =           the method used to combine the weighted observations into a
                     smooth estimate. Select one of five smoothing methods.

  MEAN               moving average of points in region

  TRIM               mean of the points after discarding the most extreme 50% in
                     the current region

  MEDIAN             running median of points in region

  POLY               polynomial regression function (default)

  ROBUST             robust polynomial regression function

DEGREE = n           for the polynomial and robust smoothing methods, the
                     degree of the polynomial as 1 (default), 2, or 3.

BANDWIDTH = d        for WINDOW = FIXED, the bandwidth of the global window
                     in data units (d > 0). The default bandwidth depends on the
                     kernel function and the variability of the data.

NEIGHBORS = n        for WINDOW = KNN, the number of data points used for
                     each smooth estimate (n > 0).

PROPORTION = p       the size of the region used for each smooth estimate as a
                     proportion (0 < p <= 1). For WINDOW = FIXED, p is the
                     proportion of the largest range for the independent variables, so
                     the BANDWIDTH equals 0.5*p*(range). For WINDOW = KNN,
                     p is the proportion of the number of cases, so
                     NEIGHBORS = p*N. BANDWIDTH or NEIGHBORS override
                     PROPORTION when they are specified.

GRID = n             the number of grid points for 2D or 3D smoothing
                     (0 <= n <= 100). The default is 25.

CANONICAL            for WINDOW = FIXED, Marron and Nolan's canonical scaling,
                     which normalizes window widths for different kernels.

LOESS                Cleveland's LOESS smoother. This is equivalent to:
                     WINDOW = KNN, SMOOTHER = ROBUST,
                     PROPORTION = .5, and KERNEL = TRICUBE. By default,
                     DEGREE=1 for the robust polynomial.
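

The relationship between PROPORTION, BANDWIDTH, and NEIGHBORS, and the way a kernel turns distances into weights, can be sketched in Python. This is a minimal illustration, not SYSTAT's implementation: the kernel formulas are the standard textbook forms, the rounding of p*N is an assumption, and all function names are hypothetical.

```python
def epanechnikov(u):
    """Standard Epanechnikov kernel on the scaled distance u."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def biweight(u):
    return (15.0 / 16.0) * (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0


def tricube(u):
    return (70.0 / 81.0) * (1.0 - abs(u) ** 3) ** 3 if abs(u) < 1.0 else 0.0


def triweight(u):
    return (35.0 / 32.0) * (1.0 - u * u) ** 3 if abs(u) < 1.0 else 0.0


def fixed_bandwidth(p, x):
    """WINDOW = FIXED: BANDWIDTH = 0.5 * p * (range of the predictor)."""
    return 0.5 * p * (max(x) - min(x))


def knn_neighbors(p, n_cases):
    """WINDOW = KNN: NEIGHBORS = p * N (rounding is an assumption here)."""
    return max(1, round(p * n_cases))


def kernel_mean(x, y, x0, bandwidth, kernel=epanechnikov):
    """SMOOTHER = MEAN analogue: kernel-weighted average of y around x0."""
    weights = [kernel((xi - x0) / bandwidth) for xi in x]
    total = sum(weights)
    if total == 0.0:
        return None  # no cases fall inside the window
    return sum(w * yi for w, yi in zip(weights, y)) / total


# Hypothetical data lying on a straight line.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 2.0, 3.0, 4.0]
estimate = kernel_mean(x, y, x0=2.0, bandwidth=2.0)
```

Because the weights are symmetric around `x0` in this example, the weighted average of the linear data reproduces the value on the line.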
SPATIAL:
Spatial Statistics 


SPATIAL computes a variety of statistics on a 2-D or 3-D spatially oriented data set. 
Variograms assist in the identification of spatial models. Kriging offers 2-D or 3-D 
smoothing and variance estimates based on a spatial covariance model. Simulation 
realizes a spatial model using Monte Carlo methods. Finally, a variety of point-based 


The geostatistical routines in SYSTAT SPATIAL are 
Journel, 1997). Point statistics are computed from a Voronoi/Delaunay partition of 2-D
or 3-D configurations.


Setup: 


* SPATIAL
USE
* MODEL
TREND
GRID
SAVE
HOT * VARIOGRAM or KRIG or SIMULATE or POINT


MODEL command


MODEL var = varlist 
MODEL specifies a spatial model to be fit by kriging or simulation. Nested structures 
are expressed by using slashes up to three times and specifying the optional arguments 
separately for each structure, all in one statement. 

/ NUGGET=d           Raises the height of the entire semivariogram by the specified
                     value. The default value is 0.

SILL=d               Specifies the maximum value on the ordinate axis for the
                     function modeling the semivariogram. The default value is 0.

ANG1=d               Specifies the angles (in degrees) for rotation. ANG1 equals the
ANG2=d               deviation from north in a clockwise direction. ANG2 equals the
ANG3=d               deviation from horizontal (for 3D models). ANG3 equals the tilt
                     angle. The default values for ANG1, ANG2 and ANG3 are 0.

AHMAX=d              Specifies the shape of the ellipse that comprises a level curve for
AHMIN=d              a given distance calculation: AHMAX is the maximum extent,
AVERT=d              AHMIN is the minimum extent, and AVERT is the 3-D (vertical)
                     extent. For power models, AHMAX denotes the exponent.
                     The default values for AHMAX, AHMIN and AVERT are 1.

TYPE=SPHERICAL       Indicates the form of the variogram model. The default is
     EXPONENTIAL     SPHERICAL.
     GAUSSIAN
     POWER
     HOLE

/ (repeat options)   Defines any of the options above (with the exception of
                     NUGGET) for the second structure of nested models.

/ (repeat options)   Defines any of the options above (with the exception of
                     NUGGET) for the third structure of nested models.


Example:


MODEL Y = LONGITUD LATITUD / NUGGET=.06 SILL=.03 TYPE=SPHERICAL,
                           / SILL=.04 TYPE=SPHERICAL



This example defines two nested structures, both spherical, differing in sills. 
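

To make the nested structure concrete, here is a small Python sketch of the semivariogram implied by the example: a nugget of .06 plus two spherical structures with sills .03 and .04. It uses the standard spherical model formula and assumes an isotropic range of 1 for both structures (the documented default for AHMAX); the function names are hypothetical and this is not SYSTAT's code.

```python
def spherical(h, sill, a=1.0):
    """Standard spherical semivariogram structure with sill and range a."""
    if h >= a:
        return sill
    r = h / a
    return sill * (1.5 * r - 0.5 * r ** 3)


def nested_semivariogram(h, nugget, structures):
    """Nugget plus the sum of the nested structures; zero at zero distance."""
    if h == 0:
        return 0.0
    return nugget + sum(spherical(h, sill, a) for sill, a in structures)


# Values from the example above; the ranges of 1.0 are assumed defaults.
structures = [(0.03, 1.0), (0.04, 1.0)]
gamma_half = nested_semivariogram(0.5, 0.06, structures)
gamma_far = nested_semivariogram(2.0, 0.06, structures)
```

Beyond the range, the model levels off at the nugget plus the sum of the sills (0.13 in this example).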


TREND command 


TREND xvar + yvar + zvar + ,
      xvar*xvar + yvar*yvar + zvar*zvar + ,
      xvar*yvar + xvar*zvar + yvar*zvar


TREND specifies trend components for the kriging model. Trend models can use any 
ordering of the three spatial variables. For two-dimensional spatial modeling, the total 
number of combinations is 5. For three-dimensional modeling, the total number is 9. 
GRID command


GRID 


Defines a grid for sampling or fitting values. 


/ XMIN=d Sets the minimum and maximum along the X axis.
XMAX=d The default values for XMIN and XMAX are both 0. 
YMIN=d Sets the minimum and maximum along the Y axis. 
YMAX=d The default values for YMIN and YMAX are both 0. 
ZMIN=d Sets the minimum and maximum along the Z axis. 
ZMAX=d The default values for ZMIN and ZMAX are both 0. 
NX=n Specifies the number of nodes along the X axis. The default value is 10. 
NY=n Specifies the number of nodes along the Y axis. The default value is 10. 
NZ=n Specifies the number of nodes along the Z axis. The default value is 10. 
SAVE command 
SAVE filename 


Saves the results from the next SPATIAL hot command to the designated file. 
VARIOGRAM command


VARIOGRAM 


The VARIOGRAM statement produces variograms. 


/ NLAG=n           Indicates the number of lags. The default value is 10.

XLAG=d             Defines the separation distance between lags.

XLTOL=d            Defines the tolerance for lags.

AZM=d              Indicates the angle (in degrees from the North axis) at which the
                   variogram is to be computed. The default value is 0.

ATOL=d             Defines the amount of tapering (in degrees) near the origin.
                   Values exceeding 90 degrees yield an omnidirectional variogram.
                   The default value is 90.

BANDH=d            Defines the half-width of the region used to compute the
                   variogram.

DIP=d              Used for three-dimensional spaces. Along with AZM, defines
                   the angle (in degrees from the Depth axis) at which the
                   variogram is to be computed. The default value is 0.

DTOL=d             Used for three-dimensional spaces. Defines the amount of
                   tapering (in degrees) near the origin. The default value is 0.

BANDV=d            Used for three-dimensional spaces. Along with BANDH,
                   defines the area of the region used to compute the variogram.

TYPE=SEMI          Specifies the type of variogram created. Default is SEMI.
     COVARIANCE
     CORRELOGRAM
     GENERAL
     PAIRWISE
     LOG
     MADOGRAM
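

The roles of NLAG, XLAG, and XLTOL can be illustrated with a short Python sketch of the classical omnidirectional semivariogram estimator, in which each lag averages half the squared differences of all pairs whose separation falls within the lag tolerance. This is a simplified sketch under stated assumptions: it ignores AZM, ATOL, and the bandwidth options, the default tolerance of half the lag spacing is an assumption, and the function name is hypothetical.

```python
import math
from itertools import combinations


def semivariogram(points, values, nlag=10, xlag=1.0, xltol=None):
    """Omnidirectional empirical semivariogram, one value per lag (or None)."""
    if xltol is None:
        xltol = xlag / 2.0  # assumed default, not taken from the manual
    sums = [0.0] * nlag
    counts = [0] * nlag
    for (p, zp), (q, zq) in combinations(list(zip(points, values)), 2):
        h = math.dist(p, q)  # separation distance of the pair
        for k in range(1, nlag + 1):
            if abs(h - k * xlag) <= xltol:
                sums[k - 1] += (zp - zq) ** 2
                counts[k - 1] += 1
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return [s / (2.0 * c) if c else None for s, c in zip(sums, counts)]


# Hypothetical transect of three points one unit apart.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
vals = [0.0, 1.0, 3.0]
gamma = semivariogram(pts, vals, nlag=2, xlag=1.0, xltol=0.25)
```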
KRIG command


KRIG


The KRIG statement is for fitting a surface to spatial data via kriging. 


/ NXDIS=n          Specifies the number of discretization points in the X (North)
                   direction. The default value is 1.

NYDIS=n            Specifies the number of discretization points in the Y (East)
                   direction. The default value is 1.

NZDIS=n            Specifies the number of discretization points in the Z (Depth)
                   direction. The default value is 1.

NDMIN=n            Specifies the minimum number of points included in an estimate.
                   The default value is 2.

NDMAX=n            Specifies the maximum number of points included in an estimate.

RADMIN=d           Specifies the minimum horizontal direction for the search radius.
                   The default value is 1.

RADMAX=d           Specifies the maximum horizontal direction for the search radius.
                   The default value is 1.

RADVER=d           Specifies the vertical direction for the search radius. The default
                   value is 1.

SANG1=d            Defines the angle (in degrees) of the major axis of the search
                   ellipsoid. The default value is 0.

SANG2=d            Defines the angle (in degrees) of the first minor axis of the search
                   ellipsoid. The default value is 0.

SANG3=d            Defines the angle (in degrees) of the second minor axis of the
                   search ellipsoid. The default value is 0.

SKMEAN=d           Specifies the simple kriging mean. The default value is 0.

TREND              Includes trends in the kriging estimation.

TYPE=SIMPLE        SIMPLE kriging assumes the prediction model contains a
     ORDINARY      stationary mean. ORDINARY kriging adapts the fit to local
                   trends. The default method is ORDINARY. Using
                   TYPE=ORDINARY and TREND yields universal kriging.

GRAPH=CONTOUR      Kriging results can be displayed in a contour plot (CONTOUR), a
      TILE         mosaic plot (TILE), or as a three-dimensional surface (SURFACE).
      SURFACE
SIMULATE command


SIMULATE 
The SIMULATE command simulates a model using a Gaussian distribution. 


/ GRAPH=CONTOUR    Simulation results can be displayed in a contour plot
        TILE       (CONTOUR), a mosaic plot (TILE), or as a three-dimensional
        SURFACE    surface (SURFACE).


POINT command 


POINT varlist 


POINT produces areas of Voronoi polygons, nearest-neighbor distances, counts of
polygon facets, and quadrat counts for scattered points in a 2-D or 3-D space. varlist
contains two arguments for two-dimensional distributions, and three for three-
dimensional.
Basic Statistics


Basic Statistics has the usual mean, standard deviation, standard error, etc., that are
appropriate for data that follow a normal distribution. It also has a stem-and-leaf plot
for assessing distributional shape and identifying outliers.

In Basic Statistics, you can save a file of aggregate statistics. For example, suppose
your original file has exam scores for 300 students from 10 schools. You can create a new
file containing 10 records (one per school) with average math score, average verbal score,
maximum math score, verbal standard deviation, etc., for the students in each school.


Note: STATS is now one of the Global modules; that is, the STATS command is
obsolete. You can use the CSTATISTICS or RSTATISTICS command within any other
module.


Setup:


* USE
SSAVE
SAMPLE
HOT CSTATISTICS or RSTATISTICS or CLSTEM or RWSTEM or CRONBACH or MNTEST


CSTATISTICS command 


CSTATISTICS no argument                   computes specified statistics for all numeric variable(s)
                                          in the data file.

CSTATISTICS varlist or / ROWS = rowlist   computes specified statistics for variable(s) in the varlist
                                          if varlist is specified. If you specify rowlist, SYSTAT
                                          computes specified statistics for all numeric variable(s)
                                          in the data file, selecting number of case(s) in the rowlist.

CSTATISTICS varlist / ROWS = rowlist      computes specified statistics for variable(s) in the varlist,
                                          selecting number of case(s) in the rowlist.


If you specify options, only those that you specify are calculated. The default is MEAN, 
MINIMUM, MAXIMUM, SD, and N. The following options are available: 


Common Options 


/ ALL calculates all statistics except N-tiles and P-tiles. 

N number of cases. 

MIN smallest value. 

MAX largest value. 

SUM total of the observations in the sample. 

MEAN average value. 

SEM standard error of the mean. 

CIM upper and lower endpoints of the 95% confidence interval for
the mean.

CONFI=n level of confidence for mean (0 < n < 1, can also be expressed
in terms of percentage).

GMEAN geometric mean of positive non-missing observations.

HMEAN harmonic mean of positive non-missing observations.

TRMEAN=p proportion of trimming for trimmed mean [(0 < p < 1) for lower
or upper, (0 < p < 0.5) for two-sided trimming]. The default value
is 0.10.

TREGION=TWOSIDED   trimming region (i.e., whether observations only from the lower or
        UPPER      upper or two sides) to be trimmed out.
        LOWER

MEDIAN middle observation when the data are ordered. 

SD standard deviation. 

CV coefficient of variation (standard deviation divided by the mean). 

RANGE difference between the minimum and maximum values. 

VARIANCE square of the standard deviation.

SKEWNESS measure of symmetry of the sample distribution (G1).

SES standard error of skewness.

KURTOSIS measure of peakedness of the sample distribution (G2). 

SEK standard error of kurtosis. 

SWTEST Shapiro-Wilk normality test statistic along with p-value. 

ADTEST Anderson-Darling normality test statistic along with p-value. 

MSKEWNESS Mardia's skewness coefficient along with a test statistic and p-value 
using an asymptotic distribution. 

MKURTOSIS Mardia's kurtosis coefficient along with a test statistic and p-value 
using an asymptotic distribution. 

HZTEST Henze-Zirkler test statistic and associated p-value using lognormal
distribution.
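

A few of these statistics are easy to restate in code. The Python sketch below shows one plausible reading of SEM, CV, and the trimmed mean with its TREGION choices. It is illustrative only: the rule for how many cases a proportion p removes (the floor of p*N per trimmed tail) is an assumption, and the function names are hypothetical.

```python
import math
import statistics


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(N)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def cv(values):
    """Coefficient of variation: standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def trimmed_mean(values, p=0.10, region="TWOSIDED"):
    """Mean after trimming a proportion p from the chosen tail(s)."""
    data = sorted(values)
    n = len(data)
    k = int(p * n)  # cases dropped per trimmed tail (assumed rule)
    if region == "TWOSIDED":
        kept = data[k:n - k]
    elif region == "LOWER":
        kept = data[k:]
    elif region == "UPPER":
        kept = data[:n - k]
    else:
        raise ValueError("region must be TWOSIDED, LOWER, or UPPER")
    return sum(kept) / len(kept)


# Hypothetical sample with one large outlier.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
two_sided = trimmed_mean(sample, 0.10, "TWOSIDED")
upper_only = trimmed_mean(sample, 0.10, "UPPER")
```

With the outlier in the upper tail, trimming the upper or both sides pulls the estimate back toward the bulk of the data, while lower-only trimming does not.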




N-tiles and P-tiles 


NTILE=N            values that divide a sample of data into N classes containing
                   (as far as possible) equal numbers of observations, by
                   providing N-1 cut-off points.

PTILE=p1,p2...     the PTILE for percent p1 is a number x such that a percentage
                   p1 of the observations is less than or equal to x. Specify the
                   values separated by commas to request more than one
                   P-tile at a time. By default SYSTAT computes 1, 5, 10,
                   20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 95, 99 percentiles.

METHOD = ALL       computes N-tiles and P-tiles for the specified method. You are
         CLEVELAND allowed to select more than one method at a time. ALL
         WTDAVG1   computes the N-tiles and P-tiles for all seven methods. The
         CLOSEST   default method is CLEVELAND.
         WTD
         EMPCDFAVG
         WTDAVG3

Classification with N-tiles: 


The following options are available when N-tiles are requested: 


CLASSIFY = varlist       classifies the value(s) of variable(s) specified in the varlist
                         into one of the N classes defined by the N-tiles.
DATA=y11 y12 y13 ...;    classifies the specified values into one of the N
     y21 y22 y23 ...;    classes defined by the N-tiles. Separate the data for more
     y31 y32 y33 ...;    than one variable by semicolon (;).


Suppose you have requested basic statistics for k variable(s) and the CLASSIFY list
contains more than k variable(s). Then only the first k in the CLASSIFY list will be
considered. The same is true for DATA.


Example:


CSTATISTICS POP_1983 POP_1986 / ROWS = ROW(1) .. ROW(20)
CSTATISTICS POP_1983 POP_1986 / ROWS = ROW(20) .. ROW(1) NTILE=4
RSTATISTICS command


RSTATISTICS no argument                      computes specified statistics for all case(s) in the data
                                             file.

RSTATISTICS rowlist or / COLUMNS = varlist   computes specified statistics for all case(s) in the data
                                             file, selecting variable(s) in the varlist if varlist is
                                             specified. If you specify rowlist, SYSTAT computes
                                             specified statistics for case(s) in the rowlist, selecting
                                             all numeric variable(s) in the data file.

RSTATISTICS rowlist / COLUMNS = varlist      computes specified statistics for variable(s) in the
                                             varlist, selecting number of case(s) in the rowlist.


If you specify options, only those that you specify are calculated. The default is MEAN,
MINIMUM, MAXIMUM, SUM, and N. The options available here are the same as in the
CSTATISTICS command. The following options change because here you
request row statistics:


Classification with N-tiles 


CLASSIFY = varlist       classifies the value(s) of row(s) specified in the varlist into one of
                         the N classes defined by the N-tiles.
DATA=y11 y12 y13 ...;    classifies the specified values into one of the N-tile classes.
     y21 y22 y23 ...;    Separate the data for more than one case by semicolon (;).
     y31 y32 y33 ...; ...


CLSTEM command


CLSTEM no argument                   displays a stem-and-leaf plot for all numeric
                                     variable(s) in the data file.

CLSTEM varlist or / ROWS = rowlist   displays a stem-and-leaf plot for variable(s) in the
                                     varlist if varlist is specified. If you specify rowlist,
                                     SYSTAT displays a stem-and-leaf plot for all numeric
                                     variable(s) in the data file, selecting number of
                                     case(s) in the rowlist.

CLSTEM varlist / ROWS = rowlist      displays a stem-and-leaf plot for all variable(s) in the
                                     varlist, selecting number of case(s) in the rowlist.


You can specify number of lines through the following option: 


/ LINES=n number of lines used for the diagram. 


RWSTEM command


RWSTEM no argument                      displays a stem-and-leaf plot for all numeric case(s) in
                                        the data file.

RWSTEM rowlist or / COLUMNS = varlist   displays a stem-and-leaf plot for all case(s) in the data
                                        file, selecting variable(s) in the varlist if varlist is
                                        specified. If you specify rowlist, SYSTAT displays a
                                        stem-and-leaf plot for case(s) in the rowlist, selecting all
                                        numeric variable(s) in the data file.

RWSTEM rowlist / COLUMNS = varlist      displays a stem-and-leaf plot for all case(s) in the
                                        rowlist, selecting variable(s) in the varlist.


The option available here is the same as in CLSTEM.


CRONBACH command


CRONBACH varlist


Computes Cronbach's alpha.


MNTEST command


MNTEST varlist


MNTEST assesses the marginal normality for each variable in varlist, using the
Shapiro-Wilk test if the sample size is less than or equal to 5000; otherwise, it uses the
Lilliefors test (Kolmogorov-Smirnov test with estimated parameters). Further, it computes
Mardia's skewness and kurtosis coefficients for the variables in varlist and performs a
test using an asymptotic distribution. The Henze-Zirkler test statistic and p-value using
the lognormal distribution are also displayed. Finally, [...] scaled squared Mahalanobis
distances.
SAMPLE command


SAMPLE BOOT(m,n) 
SIMPLE(m,n) 
JACK 


The argument m is the number of samples; the argument n is the size of each sample. 


To get the summarized resampling output, give the SAMPLE command before the
CSTATISTICS (or RSTATISTICS) command.


List the parameters for which a resampling summary is desired as options in the
SAMPLE command.


/ CONFI = c specifies a confidence level for bootstrap-based confidence interval. The 
default value is 0.95. 


MEAN average value 
MEDIAN middle observation when the data are ordered 
SD standard deviation 


VARIANCE square of standard deviation 
SKEWNESS measures the symmetry of the sample distribution. 
KURTOSIS measures the peakedness of the sample distribution. 
Example: 
SAMPLE BOOT(500,50) / MEAN MEDIAN 
CSTATISTICS INCOME 
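

What BOOT(m,n) asks for can be sketched in Python: draw m samples of size n with replacement, compute the statistic on each, and summarize the resulting estimates. The sketch below uses a percentile interval for the CONFI level, which is an assumption (the manual does not say which bootstrap interval SYSTAT reports); the data, seed, and function name are hypothetical.

```python
import random
import statistics


def bootstrap_summary(values, m, n, stat=statistics.mean, confi=0.95, seed=12345):
    """Bootstrap m samples of size n; return (mean of estimates, lower, upper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = sorted(stat(rng.choices(values, k=n)) for _ in range(m))
    tail = (1.0 - confi) / 2.0
    lower = estimates[int(tail * (m - 1))]          # percentile interval (assumed)
    upper = estimates[int((1.0 - tail) * (m - 1))]
    return statistics.mean(estimates), lower, upper


# Hypothetical INCOME values; mirrors SAMPLE BOOT(500,50) / MEAN MEDIAN.
income = [float(v) for v in range(20, 120)]
mean_summary = bootstrap_summary(income, m=500, n=50, stat=statistics.mean)
median_summary = bootstrap_summary(income, m=500, n=50, stat=statistics.median)
```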


SSAVE command 


SSAVE filename 


Saves the statistics to a file. For N-tiles and P-tiles, if more than one method is chosen,
SYSTAT saves the statistics for the very first method requested. Use with the BY
groups feature to save summary records for each unique combination of levels of the
specified grouping variables.


/ VARIABLES   saves selected statistics to a data file. Each selected statistic is a case/variable
              in the new data file corresponding to column or row statistics.

AG            writes all aggregate statistics as one record. If omitted, each statistic (for each
              combination of levels) forms a separate record.

MAHAL         saves data and squared Mahalanobis distances. This is active only
              for the MNTEST subcommand.


Example: 
BY SEX$ DOSE 


SSAVE AGSTATS / AG 
CSTATISTICS INCOME AGE SIDEFFCT / MEAN MAX SEM 


If SEX$ has two levels (male and female) and DOSE has four levels (say, 0, 6.25, 12.5,
and 25), eight records are saved with the mean, maximum value, and standard error of
the mean for each variable. The variable names are
ME1INCOME, MA1INCOME, SE1INCOME, NU2AGE, ME2AGE,
SE2AGE, NU3SIDEF, ME3SIDEF, MA3SIDEF, and SE3SIDEF. If AG is omitted from
the example, the file AGSTATS would have 32 (8 x 4) records.
SURVIVAL:
Survival Analysis 


SURVIVAL can be used to explore grouped, right-censored, and interval-censored 
survival data and to estimate nonparametric, partially parametric, and fully parametric 
models by maximum likelihood. SURVIVAL can handle disjoint and overlapping 
interval-censored data and combinations of interval censoring, right censoring, and 
exact failure times. The facilities provided in SURVIVAL include the Kaplan-Meier 
estimator, Turnbull’s generalization of the Kaplan-Meier estimator for interval- 
censored data, Nelson-Aalen cumulative hazard estimator, plots of failure and 
censoring times, quantile plots for standardized reference distributions, Cox-Snell 
residual plots for Cox and parametric models, log-rank tests, the proportional hazards 
(Cox) regression, and the Weibull, log-normal, log-logistic, and exponential regression 
models. All models may be estimated with or without covariates, either directly or by 
stepwise regression procedures. Akaike and Bayesian information criteria
(AIC, Schwarz’s BIC) values are provided for each fitted model. For more information
on AIC and Schwarz’s BIC in SYSTAT, refer to the chapter Linear Models, “Variable
Selection”, in Statistics II. The Kaplan-Meier estimator, quantile plots, and Cox
regression all permit stratification. The survivor function, hazard function, reliabilities, 
and quantiles may be generated from parametric models for specific covariate values, 
and the baseline hazards may be derived from the Cox and stratified Cox models. The 
results of most analytic techniques may be saved into SYSTAT files for further 
manipulation and analysis with other SYSTAT modules. 
Setup:


* SURVIVAL 

USE 

* MODEL 
PLENGTH 
FUNPAR 
SAVE 

HOT * ESTIMATE or START 

STEP 
(series of steps) 
STOP 
SAVE 
LTAB 
NAHAZARD 
QNTL 
RELIABILITY 
HAZARD 
ACT 


MODEL command 


MODEL time = covarlist | dcovarlist 


Specifies a model to be estimated. If there are no time-dependent covariates, omit the
vertical bar (|).


/ CENSOR=varname   specifies censoring variables
LOWER=varname      specifies lower-bound variables, which are used for interval censoring
STRATA=varname     specifies strata variables
PLENGTH command


PLENGTH SHORT 
LONG 


Controls the amount of output reported for Cox regression and parametric models.
Select one of two categories of output:


SHORT   input data information, covariate means, iteration history, parameter
        estimates, standard error, z-statistics, p-values.
LONG    SHORT plus score test results, 95% confidence intervals, covariance and
        correlation matrices of parameters.


FUNPAR command 


FUNPAR tdcovar = expr 


Specifies time-dependent covariates. There is one FUNPAR statement for each time- 
dependent covariate. 


Example: 


FUNPAR X =GROUP*(LOG(DAYS)-5.4) 
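

To show what this expression evaluates to, here is the same formula in Python. It assumes that LOG denotes the natural logarithm and that GROUP is a 0/1 indicator; the function name and the sample values are hypothetical.

```python
import math


def funpar_x(group, days):
    """Time-dependent covariate X = GROUP * (LOG(DAYS) - 5.4)."""
    return group * (math.log(days) - 5.4)  # natural log is an assumption


# The covariate is zero for GROUP = 0 at every time point, and for GROUP = 1
# it changes sign where LOG(DAYS) crosses 5.4.
x_early = funpar_x(1, 100)
x_late = funpar_x(1, 1000)
x_reference = funpar_x(0, 1000)
```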
ESTIMATE command


ESTIMATE 


Runs the problem.


/ COX               Cox regression. Default.
LGST                log-logistic model
EXP                 exponential model
EEXP                extreme value exponential model
WB                  Weibull model
EW                  extreme value Weibull model
LNOR                log-normal model
START=d1, d2, ...   starting values
TOLERANCE=d         tolerance value. The default value is 1.0E-12.
CONVERGE=d          sets the convergence criterion. This is the largest relative change in any
                    coordinate before iterations terminate. The default value is
                    1.0E-6.

START command 
START 


Begins estimation on stepwise regression. 


/ COX   Cox regression. Default.

LGST    log-logistic model.

EXP     exponential model.

EEXP    extreme value exponential model.

WB      Weibull model.

EW      extreme value Weibull model.

LNOR    log-normal model.

BACKWARD initializes backward stepping. 

FORWARD initializes forward stepping. SYSTAT reports results for step 0. If 
omitted, all variables in MODEL that pass ENTER/ REMOVE limits are 
entered or removed from the model. 

ENTER=p probability value for variable to enter equation in a step. The default 
value is 0.15. 

REMOVE=p probability value for variable to leave equation in a step. The default 
value is 0.15. 

FORCE=n forces the first n variables in MODEL into the model. The default value 
is 0. 

MAXSTEP=d maximum number of steps to take. The default value is 25.

TOLERANCE=d tolerance value. 

CONVERGE=d sets the convergence criterion. This is the largest relative change in any 
coordinate before iterations terminate. 

STEP command 
STEP var 
+ 


Initiates stepwise model building. 


/ AUTO initiates automated stepwise regression. 


STOP command 


STOP 


Stops stepping and prints final model. 
SAVE command


SAVE filename


Before ESTIMATE: saves the input data after processing any lower-bound variable and
censoring variable values as explained previously.


/ RESIDUALS   in the case of accelerated failure time models, saves the regression
              residuals.


Before other HOT commands: saves the tables generated by the HOT command given.


LTAB command 


LTAB 


Requests calculation and plotting of K-M probabilities, their standard errors and
confidence intervals in case of nonparametric estimation. Requests life tables, hazard
rates and standard errors in case of Cox and parametric models.


/ TLOG                      log time axis
covar1=d1, covar2=d2...     specifies fixed values on the covariates over which tables are
                            produced
CHAZ                        cumulative hazard on y axis
LCHAZ                       log cumulative hazard on y axis
COMP                        displays K-M probabilities in the life table
QUANTILES = p1, p2,...      displays the survival quantiles corresponding to the specified
                            percentage points. The default quantiles are 0.75, 0.5 and 0.25.
CONFI=n                     specifies the level of confidence for K-M probabilities, K-M
                            cumulative hazards, K-M log cumulative hazards, mean survival
                            time and survival quantiles. The default value is 0.95.
Example: 


LTAB / SEX=1, X(2)=3 
NAHAZARD command 


NAHAZARD 


Requests calculation and plotting of N-A cumulative hazards, their standard errors and 
confidence intervals. 


/ TLOG    log time axis 
LCHAZ     log cumulative hazard on y axis 
CONFI=n   specifies the level of confidence for N-A cumulative hazards and 
          N-A log cumulative hazards. The default value is 0.95. 


QNTL command 


QNTL 


Requests quantiles and their approximate confidence intervals based on the last 
parametric model estimated. 


/ TLOG log time axis 
covar1=d1,covar2=d2,... specifies fixed values on the covariates over which quantiles are 
produced 
Example: 


QNTL / SEX=1, RACE=0 


RELIABILITY command 


RELIABILITY maxtime, (#bins) 


Requests reliabilities and their confidence intervals based on the last parametric model 
estimated. 


/ TLOG log time axis 


covar1=d1, covar2=d2.... specifies fixed values on the covariates over which reliabilities 
are produced 
HAZARD command 


HAZARD maxtime, (#bins) 


Requests values of the hazard function at specified times, and their approximate 
confidence intervals, based on the last parametric model estimated. 


/ TLOG log time axis 
covar1=d1, covar2=d2... specifies fixed values on the covariates over which hazards are 
produced 
ACT command 


ACT maxtime, (#bins) 
Generates an actuarial life table. 
/ CONDITIONAL conditional life table 


LIFE actuarial life table. Default. 
HAZARD actuarial hazard table 
TESTAT: 
Test Item Analysis 


TESTAT provides classical analysis and logistic item-response analysis of tests that are 
comprised of responses to each of a set of test items (variables) by each of a set of 
respondents (cases). Classical analysis provides test summary statistics, reliability 
coefficients, standard errors of measurement for selected score intervals, item analysis 
statistics, and summary statistics for individual cases. Graphical as well as numerical 
displays are provided. 

You also can score individual items for each respondent provided that test items are 
of the “right versus wrong” variety. However, TESTAT is not limited to these kinds of 
data; it will accept and analyze any numerical variables that can be used in SYSTAT. 
Thus, data from true/false tests, multiple-choice tests, rating scales, physiological 
measures, etc. can all be analyzed with TESTAT using the classical test theory model. 

Either a one- or two-parameter logistic model can be selected. Item histograms can 
be printed to examine the fit of each item to the model. TESTAT can save subject scores 
into a SYSTAT file. 
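
The one- and two-parameter logistic models mentioned above can be illustrated with a short Python sketch. It uses the standard textbook form of the item-response function; whether TESTAT uses exactly this parameterization or scaling is an assumption, and the function and argument names are hypothetical.

```python
import math


def item_probability(ability, difficulty, discrimination=1.0):
    """Probability of a correct response under a logistic item-response model.

    With discrimination fixed at 1.0 this is the one-parameter (Rasch) form;
    letting discrimination vary per item gives the two-parameter form.
    This is the textbook formula, not necessarily TESTAT's internal scaling.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


# A respondent whose ability equals the item difficulty has an even chance.
p_equal = item_probability(ability=0.0, difficulty=0.0)

# An easy item (difficulty -1) answered by an able respondent (ability 1).
p_easy = item_probability(ability=1.0, difficulty=-1.0)

# A more discriminating item separates abilities around its difficulty faster.
p_steep = item_probability(ability=1.0, difficulty=0.0, discrimination=2.0)
```

In this form, difficulty shifts the curve along the ability scale and discrimination controls its steepness.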


Setup: 
* TESTAT 
* USE 
MODEL 
KEY 
PLENGTH 
SAVE 
HOT * ESTIMATE 
MODEL command 


MODEL varlist 


Specifies the names of the items to be analyzed. 
PLENGTH command 


PLENGTH SHORT 
        MEDIUM 
        LONG 


Controls the amount of output reported. Select one of three categories of output: 


SHORT    For the classical model, this option displays summary statistics for the test 
         and a set of reliability (internal consistency) coefficients. For the logistic 
         models, the iteration history is displayed indicating the case with the greatest 
         change in ability estimate, and items with the greatest change in difficulty 
         and discrimination estimates at each step. 

MEDIUM   For the classical model, this option gives the same output as SHORT. For 
         the logistic models, SHORT output plus item difficulties and discrimination 
         indices are displayed. 

LONG     For the classical model, SHORT output plus individual scores and item 
         histograms are displayed. For the logistic models, MEDIUM output plus 
         individual estimated item-response abilities and item histograms are displayed. 


SAVE command 


SAVE filename 


Saves specified statistics to a file. 
ESTIMATE command 


ESTIMATE 


Produces statistics for the chosen model. 


/ HALF        split-half reliabilities. 
CLASSICAL     classical model. 
LOG1          one-parameter logistic (Rasch) model. Default. 
LOG2          two-parameter logistic model. 
STEPS=n       number of steps in logistic iterations. The default is 10. 
ITER=n        number of logistic iterations. The default is 20. 
CONVERGE=d    sets the convergence criterion. This is the largest relative 
              change in any coordinate before iterations terminate. 
              The default is 0.05. 
LCONVERGE=d   sets the likelihood convergence criterion. The default is 0.005. 
KEY command 
KEY (values) 


Alters the nature of the data by scoring each item response as correct or incorrect or by 
reversing the scoring scale, when it precedes the ESTIMATE command. You can check the 
effect of reverse scoring on the offending items by using KEY with the + and - options. 


Example: 


KEY1 11717117111 
TESTING: 
Hypothesis Testing 


TESTING provides one-sample, two-sample, and multi-sample tests for testing a null 
hypothesis against an alternative hypothesis of the following types: 'not equal', 'less 
than', and 'greater than'. Confidence intervals are also computed (or a one-sided 
confidence bound, if a one-sided alternative hypothesis is chosen). 

You can perform a one-sample z-test when you have a sample from a normal 
distribution with known standard deviation, and a t-test when the standard deviation is 
unknown. Similarly, you can perform two-sample z-tests and t-tests. You can perform 
a paired t-test for equality of two means when the observations are paired (and hence 
correlated). You can also perform a test for the mean of a Poisson distribution. 

You can test for a single variance and for equality of two variances, and perform Bartlett's 
and Levene's tests for equality of several variances. When you have a sample from a 
bivariate normal distribution, you can test for zero correlation and a specified value of 
correlation. You can also test for equality of correlation coefficients. In addition, you 
can test for proportions: single proportion and equality of two proportions. 
Setup: 
* TESTING 
* USE 

HOT ZTEST varlist = CONSTANT / SD=s                         one-sample z-test 
HOT ZTEST varlist*grpvar / SD1=s1 SD2=s2                    two-sample z-test 
HOT TTEST varlist = CONSTANT                                one-sample t-test 
HOT TTEST varlist                                           paired t-test 
HOT TTEST varlist*grpvar                                    two-sample t-test 
HOT POISSON varlist = CONSTANT                              Poisson test 
HOT VARI varlist = CONSTANT                                 test for single variance 
HOT VARI varlist*grpvar                                     test for equality of two variances and 
                                                            several variances 
HOT TCORR varlist                                           test for zero correlation 
HOT TCORR varlist = CONSTANT                                test for specific correlation 
HOT TCORR var1 var2 = var3 var4                             test for equality of two correlations 
HOT PROP trialvar successvar = P                            test for single proportion 
HOT PROP trialvar1 successvar1 = trialvar2 successvar2      test for equality of two proportions 
Tests for Means 
One-sample z-test 
ZTEST command 


ZTEST varlist = CONSTANT / SD=s 


Performs a one-sample z-test for the variables in varlist, for the hypothesized value 
CONSTANT. 

/ SD=s      standard deviation (known). 
BONF        Bonferroni adjusted probabilities. 
DUNN        Dunn-Sidak adjusted probabilities. 
CONFI=n     level of confidence for the mean (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Example: 


ZTEST MATHMARK SCIMARK = 85 / SD=5 
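
For readers who want to see the arithmetic behind a one-sample z-test, here is a minimal Python sketch. It assumes the usual textbook computation (known standard deviation, standard normal reference distribution); the function name, argument names, and data values are hypothetical and are not taken from SYSTAT.

```python
import math
from statistics import NormalDist, mean


def one_sample_z(values, constant, sd, alter="NE", confi=0.95):
    """Textbook one-sample z-test with a known standard deviation.

    alter mirrors the ALTER option: "NE" (two-sided), "GT", or "LT".
    Returns the z statistic, the p-value, and a confidence interval
    (a one-sided bound when the alternative is one-sided).
    """
    std_normal = NormalDist()
    n = len(values)
    xbar = mean(values)
    se = sd / math.sqrt(n)  # standard error of the mean
    z = (xbar - constant) / se
    if alter == "NE":
        p = 2.0 * (1.0 - std_normal.cdf(abs(z)))
        half = std_normal.inv_cdf(0.5 + confi / 2.0) * se
        interval = (xbar - half, xbar + half)
    elif alter == "GT":
        p = 1.0 - std_normal.cdf(z)
        interval = (xbar - std_normal.inv_cdf(confi) * se, math.inf)
    elif alter == "LT":
        p = std_normal.cdf(z)
        interval = (-math.inf, xbar + std_normal.inv_cdf(confi) * se)
    else:
        raise ValueError("alter must be NE, GT, or LT")
    return z, p, interval


# Made-up marks, tested against the hypothesized mean 85 with known SD 5.
marks = [88, 91, 79, 85, 90, 84, 87, 92]
z, p, ci = one_sample_z(marks, constant=85, sd=5)
z_gt, p_gt, bound_gt = one_sample_z(marks, constant=85, sd=5, alter="GT")
```

With a one-sided alternative the interval collapses to a single confidence bound, which matches the behavior described in the chapter introduction.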


Two-sample z-test 


ZTEST command 


ZTEST varlist*grpvar / SD1=s1 SD2=s2 


Performs a two-sample z-test for the variables in varlist, where the two samples are 
defined by grpvar. The grouping variable must have only two groups and can be either 
numeric or character. 


/ SD1=s1    standard deviation (known) for group 1. 
SD2=s2      standard deviation (known) for group 2. 
BONF        Bonferroni adjusted probabilities. 
DUNN        Dunn-Sidak adjusted probabilities. 
CONFI=n     level of confidence for difference in the means (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Example: 


ZTEST SALBEG SALNOW*SEX / SD1=15 SD2=18 
One-sample t-test 


TTEST command 


TTEST varlist = CONSTANT 


Performs a one-sample t-test for the variables in varlist, for the hypothesized value 
CONSTANT. 

/ BONF      Bonferroni adjusted probabilities. 
DUNN        Dunn-Sidak adjusted probabilities. 
CONFI=n     level of confidence for the mean (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Example: 


TTEST INCOME = 28500 


Paired t-test 


TTEST command 


TTEST varlist 


Performs a paired comparison t-test on all pairs of variables in varlist. 


/ BONF      Bonferroni adjusted probabilities. 
DUNN        Dunn-Sidak adjusted probabilities. 
CONFI=n     level of confidence for difference in the means (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Example: 


TTEST BEFORE AFTER 
Two-sample t-test 


TTEST command 


TTEST varlist*grpvar 


Performs a two-sample t-test for the variables in varlist, where the two samples are 
defined by grpvar. The grouping variable must have only two groups and can be either 
numeric or character. 


/ BONF      Bonferroni adjusted probabilities. 
DUNN        Dunn-Sidak adjusted probabilities. 
CONFI=n     level of confidence for difference in the means (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Example: 


TTEST MATH VERBAL GERMAN*SEX$ / BONF 
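
The BONF and DUNN options adjust probabilities when several tests are run together, as in the three-variable example above. The sketch below shows the standard definitions of the two adjustments in Python. How SYSTAT counts the number of tests in a given command is an assumption here, and the raw p-values are invented for illustration.

```python
def bonferroni(p, m):
    """Bonferroni adjustment: multiply the raw p-value by the number of tests, capped at 1."""
    return min(1.0, m * p)


def dunn_sidak(p, m):
    """Dunn-Sidak adjustment: probability of at least one result this extreme among m tests."""
    return 1.0 - (1.0 - p) ** m


# Invented raw p-values for three tests run together.
raw = [0.010, 0.030, 0.200]
m = len(raw)
bonf = [bonferroni(p, m) for p in raw]
sidak = [dunn_sidak(p, m) for p in raw]
```

The Dunn-Sidak value is never larger than the Bonferroni value for the same raw probability, so it is the slightly less conservative of the two.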


Poisson test 


POISSON command 


POISSON varlist = CONSTANT 


Performs a Poisson test for the variables in varlist, for the hypothesized value 
CONSTANT. 


/ BONF      Bonferroni adjusted probabilities. 
DUNN        Dunn-Sidak adjusted probabilities. 
CONFI=n     level of confidence for the square root of Poisson mean (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Example: 


POISSON DEFECTS = 28 


Tests for Variances 
Single variance 


VARI command 


VARI varlist = CONSTANT 


Performs the test for single variance for the variables in varlist, for the hypothesized 
value CONSTANT. 


/ CONFI=n   level of confidence for the variance (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Example: 


VARI MARKS = 76 
Equality of two variances and several variances 


VARI command 


VARI varlist*grpvar 


Performs the test for equality of two variances for the variables in varlist, where the two 
samples are defined by grpvar. The grouping variable must have only two groups and 
can either be numeric or character. 


/ CONFI=n   level of confidence for the ratio of variances (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Example: 
VARI MARKS*SECTIONS 
If grpvar has more than two distinct categories, SYSTAT performs Bartlett's and 
Levene's tests. 
Tests for Correlations 


Zero correlation 


TCORR command 


TCORR varlist 


Performs the test for zero correlation coefficient for all the pairs of variables in varlist. 


/ CONFI=n   level of confidence for the correlation coefficient (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Example: 


TCORR EDLEVEL SALNOW 


Specific correlation 


TCORR command 


TCORR varlist = CONSTANT 


Performs the test for specific correlation for all the pairs of variables in varlist for the 
hypothesized value CONSTANT. 


/ CONFI=n   level of confidence for the correlation coefficient (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Example: 


TCORR EDLEVEL SALNOW = 0.75 
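
One common way to test a correlation against a specified value is Fisher's z transformation. The Python sketch below illustrates that approach. It is an assumption that SYSTAT's test for a specific correlation works this way; the sample correlation and sample size are invented, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def correlation_z_test(r, n, rho0=0.0):
    """Two-sided test of H0: rho = rho0 using Fisher's z transformation.

    atanh(r) is approximately normal with standard error 1 / sqrt(n - 3)
    for a sample of size n from a bivariate normal distribution.
    """
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


def correlation_interval(r, n, confi=0.95):
    """Confidence interval for rho, built on the z scale and transformed back."""
    half = NormalDist().inv_cdf(0.5 + confi / 2.0) / math.sqrt(n - 3)
    centre = math.atanh(r)
    return math.tanh(centre - half), math.tanh(centre + half)


# Invented values: sample correlation 0.60 from 50 cases, tested against 0.75.
z, p = correlation_z_test(r=0.60, n=50, rho0=0.75)
low, high = correlation_interval(r=0.60, n=50)
```

Transforming back with tanh keeps the interval inside the valid range of a correlation, which a naive interval on r itself would not guarantee.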


Equality of two correlations 


TCORR command 


TCORR var1 var2 = var3 var4 


Performs the test for equality of two correlations for the pairs of variables (var1, var2) 
and (var3, var4). 


/ ALTER=NE specifies the alternative hypothesis. The default is NE. 
The other choices are GT: greater than and LT: less than. 


Example: 


TCORR RUHEALTH RUEDUC = CTHEALTH CTEDUC 
Tests for Proportions 
Single proportion 


PROP command 


PROP trialvar successvar = P (or) 
PROP N X=P 


Performs the test for single proportion for the input trialvar (N) and successvar (X) for 
the hypothesized value P. 


/ CONFI=n   level of confidence for the proportion (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Examples: 


PROP 100 29 = 0.4 


PROP TRIALS FAILURE = 0.4 
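
As a rough illustration of what a single-proportion test computes, the Python sketch below applies the large-sample normal approximation to the first example (100 trials, 29 successes, hypothesized value 0.4). Whether SYSTAT uses this approximation or an exact method is an assumption; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def single_proportion_z(n, x, p0):
    """Large-sample z-test of H0: p = p0 for x successes in n trials (two-sided)."""
    p_hat = x / n
    se0 = math.sqrt(p0 * (1.0 - p0) / n)  # standard error under the null hypothesis
    z = (p_hat - p0) / se0
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return p_hat, z, p_value


# Same inputs as PROP 100 29 = 0.4.
p_hat, z, p_value = single_proportion_z(n=100, x=29, p0=0.4)
```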
Equality of two proportions 


PROP command 


PROP trialvar1 successvar1 = trialvar2 successvar2 (or) 
PROP N1 X1 = N2 X2 


Performs the test for equality of proportions for the input trialvar1 (N1), successvar1 
(X1) and trialvar2 (N2), successvar2 (X2). 


/ CONFI=n   level of confidence for the difference in proportions (0 < n < 1). 
ALTER=NE    specifies the alternative hypothesis. The default is NE. The other choices 
      LT    are GT: greater than and LT: less than. 
      GT 


Examples: 


PROP 100 29 = 50 19 


PROP SHIFT1 DEFECTS1 = SHIFT2 DEFECTS2 
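
A comparable sketch for two proportions uses the pooled large-sample z statistic, applied here to the counts in the first example (29 of 100 versus 19 of 50). The pooled normal approximation is an assumption about the method, not a statement of what SYSTAT computes, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z(n1, x1, n2, x2):
    """Pooled large-sample z-test of H0: p1 = p2 (two-sided)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under the null hypothesis
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return p1 - p2, z, p_value


# Same inputs as PROP 100 29 = 50 19.
diff, z, p_value = two_proportion_z(n1=100, x1=29, n2=50, x2=19)
```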
TREES: 
Classification and Regression Trees 


The TREES module computes classification and regression trees. Classification trees 
include those models in which the dependent variable (the predicted variable) is 
categorical. Regression trees include those in which it is continuous. Within these types 
of trees, the TREES module can use categorical or continuous predictors, depending on 
whether a CATEGORY statement includes some or all of the predictors. 

For any of the models, a variety of loss functions are available. Each loss function 
is expressed in terms of a goodness-of-fit statistic, proportion of reduction in error 
(PRE). For regression trees, this statistic is equivalent to the multiple R². Other loss 
functions include a Gini-type index, twoing, and phi-square. 

TREES produces graphical trees called mobiles. At each branch is a density (box 
plot, dot plot, histogram, etc.) showing the distribution of observations at that point. 
The branches balance at each node so that the branch is level, given the number of 
observations at each end. The physical analogy is most obvious for dot plots, where the 
stacks of dots (one for each observation) balance like marbles in bins. 

TREES also produces, optionally, a SYSTAT program to code new observations 
and predict the dependent variable. This program can be saved to a file and run from 
the command window or submitted as a program file. 
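
To make the PRE statistic concrete for a regression tree with least-squares loss, the Python sketch below computes the proportional reduction in error for a single split. The data and function names are invented for illustration, and the sketch covers only the least-squares case, not the other loss functions.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (least-squares loss at one node)."""
    centre = sum(values) / len(values)
    return sum((v - centre) ** 2 for v in values)


def pre_for_split(left, right):
    """Proportional reduction in error from splitting one node into two.

    PRE = 1 - (error after the split) / (error before the split),
    which for least-squares loss has the same form as R-squared.
    """
    before = sum_squared_error(left + right)
    after = sum_squared_error(left) + sum_squared_error(right)
    return 1.0 - after / before


# Invented dependent-variable values falling on each side of a candidate split.
left = [1.0, 2.0, 3.0]
right = [7.0, 8.0, 9.0]
pre = pre_for_split(left, right)
```

A split that separates low and high values cleanly, as here, leaves little error within the two branches and so yields a PRE close to 1.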


Setup: 
* TREES 
* USE 
* MODEL 
PLENGTH 
HOT * ESTIMATE 
MODEL command 


MODEL depvar = indvarlist 
Specifies the names of the items to be analyzed. 


/ EXPAND adds to the model all possible sums and differences of pairs of predictors. 
PLENGTH command 


PLENGTH SHORT 
LONG 


To request extended results, enter PLENGTH prior to ESTIMATE. The default output 
includes splitting history and summary statistics. PLENGTH LONG adds a SYSTAT 
program for classifying new observations. 


ESTIMATE command 


ESTIMATE 


Initiates estimation. 


/ LOSS=LSQ       least-squares loss (the AID model). Default. 
       TRIM      trimmed mean loss. 
       LAD       least absolute deviations loss. 
       PHI       phi-coefficient loss. 
       GINI      Gini index loss. 
       TWO       twoing loss. 
NSPLIT=n         maximum number of splits. The default is 10. 
PMIN=d           minimum proportion reduction in error for tree allowed at any split. 
                 The default is 0.05. 
SMIN=d           minimum split value allowed at any node. The default is 0.05. 
NMIN=n           minimum count allowed at any node. The default is 5. 
DENSITY=BOX      type of density display. The default is box plot. 
        DIT      dot histogram. 
        DOT      dot plot. 
        JITTER   jitter density. 
        STRIPE   stripe density. 
TSLS: 
Two-Stage Least Squares 


The TSLS module is designed for estimation of simultaneous equations systems via 
Two-Stage Least Squares (TSLS) and Two-Stage Instrumental Variables (TSIV). TSLS 
produces heteroskedasticity-consistent standard errors for ordinary least-squares 
(OLS) models and instrumental variables models, and provides diagnostic tests for 
heteroskedasticity and nonlinearity. TSLS also computes regressions with 
polynomially distributed lag structure in the errors. 


Setup: 
* TSLS 
* USE 
* MODEL 
SAVE 
HOT * ESTIMATE 
HYPOTHESIS 
CONSTRAIN 
HOT * TEST 


MODEL command 


MODEL depvar = CONSTANT + var + var*var +... | instrumvarlist 


Specifies a model to be estimated. Lagged variables must be followed by a colon and 
the number of lags. The degree of the polynomial, if any, follows the number of lags 
in parentheses. 


Examples: 
MODEL M2 = CONSTANT + GDP | CONSTANT + GPDI + FEDEXP *TB6 
MODEL M2 = CONSTANT + GDP:2 | CONSTANT + GPDI * FEDEXP *TB6 


MODEL M2 = CONSTANT + GDP:2(1) | CONSTANT * GPDI * FEDEXP *TB6 
SAVE command 


SAVE filename 


Saves predicted values and residuals to a file. In case of BY groups, the desired 
statistics for each BY group are saved in the same file. 


ESTIMATE command 


ESTIMATE 
Fits the previously specified model. 


/HC computes heteroskedasticity-consistent standard errors. 


HYPOTHESIS command 


HYPOTHESIS 
Initiates a block of CONSTRAIN statements for specifying and testing a hypothesis. 


CONSTRAIN command 


CONSTRAIN argument 


Defines the restrictions being tested. These can include any linear algebraic expression 
without parentheses involving the parameters. If interactions were present on the 
MODEL statement, they can also appear on the CONSTRAIN statement. 


Example: 


CONSTRAIN SALES = 0 
TEST command 


TEST 


Initiates testing of the hypothesis. 
VC: 
Variance Components 


Variance Components (VC) can carry out estimation and hypothesis tests in a variance 
components model for both balanced and unbalanced data. A variance components 
model can have any number of fixed and/or random effects, including interactions 
(crossed effects) and nestings (nested effects). Both categorical and continuous 
variables are allowed as predictor variables. Thus VC can be used to fit mixed 
regression as well as mixed ANOVA models. The models handled by VC constitute a 
subclass of those handled by MIXED, which allows more general covariance structures 
for the random effects and the random error. The subclass of models dealt with by VC 
is arguably the most frequently used type of linear mixed model. 


Setup: 
* VC 
* USE 
RESET 
* MODEL 
CATEGORY 
PLENGTH 
RANDOM 
SAVE 
HOT * ESTIMATE 
* HYPOTHESIS 
FMATRIX [matrix] 
RMATRIX [matrix] 
DMATRIX [matrix] 
PAIRWISE 
HOT * TEST 
Model Building 
RESET command 
Clears all VC command specifications from memory, returning all commands to their 
default state. 
MODEL command 


MODEL var1 = INTERCEPT + varlist + var2*var3 + var4(var5) 
Specifies the linear model to estimate. var1 is the dependent variable. Specify the 
model with intercept by INTERCEPT or CONSTANT. Specify interactions by linking 
variables with an asterisk. The parentheses denote nested factors. 


Examples: 


MODEL Y = INTERCEPT + A + X*X + B(A) 
MODEL Y = GENDER + GENDER*COUNTRY$ 


MODEL Y = A*B*C 


CATEGORY command 


CATEGORY grpvarlist 
Specifies numeric or string grouping variables that define cells. 


/ MISS allows cases with a missing value for the categorical 
variable to be included in the analysis. 


PLENGTH command 


PLENGTH SHORT 
MEDIUM 
LONG 


VC produces extended output if you set the output length to LONG. For model 
estimation, extended output adds BLUEs of fixed effects and BLUPs of random effects. 


RANDOM command 


RANDOM varlist 
RANDOM var1 
RANDOM var2*var3 


RANDOM var4(var5) 
Specifies the random effects of the model. You can use multiple RANDOM statements 
if you want to specify more than one random effect. 


Examples: 
RANDOM INTERCEPT + A+B 


RANDOM INTERCEPT + A 


RANDOM A*B 
SAVE command 


SAVE filename 


Before ESTIMATE. Saves specified statistics to a file. 


/ MRESIDUALS marginal residuals 
CRESIDUALS conditional residuals 
DATA data from original file along with marginal and conditional 
residuals 
VARCOMPONENTS estimates of variance components 
FIXED Best Linear Unbiased Estimates (BLUE) of fixed effects 
RANDOM Best Linear Unbiased Predictors (BLUP) of random effects 
SERRORFIXED standard errors of fixed effects estimates 
MODEL marginal and conditional residuals, response variable, and the design 
matrices 
Examples: 
SAVE MYFILE / VARCOMP 


SAVE MYFILE / MRESIDUALS 
ESTIMATE command 


ESTIMATE 


Tells SYSTAT to estimate the analysis specified in MODEL. 


/ METHOD = ML                estimation by Maximum Likelihood (ML) 
           REML              estimation by Restricted Maximum Likelihood (REML) 
           TYPE1             estimation by ANOVA with TYPE I sum of squares 
           TYPE2             estimation by ANOVA with TYPE II sum of squares 
           TYPE3             estimation by ANOVA with TYPE III sum of squares 
           MIVQUE0           estimation by Minimum Variance Quadratic Unbiased 
                             Estimation 

TYPE = HESSIAN               specifies the type of convergence to be checked in order 
       LIKELIHOOD            to stop iterations. 
       PARAMETERS 

CRITERION = RELATIVE         specifies the convergence criterion to be used. 
            ABSOLUTE 

NEM = n1                     specifies the maximum number of EM iterations to be 
                             performed before continuing to Newton-Raphson 
                             iterations. 

NNR = n2                     specifies the maximum number of Newton-Raphson 
                             iterations. 

CONVERGENCE = d1             sets the cutoff value for convergence criterion. Iterations 
                             stop when convergence criterion is less than this cutoff 
                             value. The default is 1e-8. 

HALF = n3                    specifies the maximum number of step-halvings in 
                             Newton-Raphson iteration. 

TOLERANCE = d2               sets the tolerance for double precision computations. 
                             The default is 1e-12. 

CONFIDENCE = d4              sets the value for confidence interval for the estimated 
                             parameters. The default is 0.95. 

GSTART = [g1, g2, ..., gk]   specifies the vector of parameters to be used as initial 
                             estimates of random effects covariance parameters. 
                             Specify error variance as a last parameter in GSTART. 
Hypothesis testing 


HYPOTHESIS command 


HYPOTHESIS 


Tells SYSTAT that you want to test hypotheses on a previous MODEL. You must enter 
HYPOTHESIS before you can use any of the following commands. 


PAIRWISE command 


PAIRWISE varlist 


This performs pairwise equality checks among the coefficients of the specified fixed 
effect. 

/ BONF    Bonferroni comparisons. This is the default method. 
LSD       LSD pairwise comparisons. 
TUKEY     uses the Studentized range statistic to make all pairwise comparisons. 
SCHEFFE   Scheffe pairwise comparisons. 
SIDAK     Student's t statistic for pairwise multiple comparisons. 
GT2       uses Studentized maximum modulus distribution. 

Example: 

PAIRWISE SEASON / BONF 


PAIRWISE SEASON 
FMATRIX command 


FMATRIX [matrix] 


FMATRIX is a matrix of linear weights contrasting the coefficient estimates for fixed 
effects. Specify as many numbers as the dimension of your beta vector. 


Example: 


FMATRIX [1 -1 0 0] 


RMATRIX command 


RMATRIX [matrix] 


RMATRIX is a matrix of linear weights contrasting the coefficient estimates for random 
effects. Specify as many numbers as the dimension of your gamma vector. 


Example: 


RMATRIX [2 1 -1] 


DMATRIX command 


DMATRIX [matrix] 
D is a null hypothesis vector. By default it is a null vector. 
Example: 


DMATRIX [10 15] 
TEST command 


TEST 
Initiates the test of your hypothesis. 
/ CONFI = n   specifies the level of confidence. 


ESTIMATE      provides an estimate of the estimable linear parametric function, its 
              standard error, and the corresponding t-test. 
XTAB: 
Crosstabulations with Tests and Measures 


XTAB can make, analyze, and save frequency tables that are formed by one, two, or 
more categorical variables. The values of the table factors can be character or numeric. 
XTAB offers counts, percents (of row totals, column totals, or the total count), and 
cumulative percents. XTAB forms tables using data read from a cases-by-variables 
rectangular file or data recorded as frequencies with cell indices. 

For two-way tables and multiway tables: standardize, XTAB provides 21 tests of 
significance or measures of association. Each is appropriate for a particular table 
structure (rows by columns). In addition, for some measures, the categories are 
assumed to be ordered (for example, low, medium, and high). The following are the 
measures available for each table structure: 


2 x 2 Tables 
    Pearson chi-square 
    Likelihood ratio chi-square 
    Yates' corrected chi-square 
    Fisher's exact test 
    Odds ratio 
    Yule's Q and Y 

2 x k Tables, ordered categories 
    Cochran's test of linear trend 

R x R Tables 
    McNemar's chi-square 
    Cohen's kappa 

R x C Tables, unordered categories 
    Pearson chi-square 
    Likelihood ratio chi-square 
    Phi 
    Cramer's V 
    Contingency 
    Uncertainty 
    Goodman-Kruskal's lambda 

R x C Tables, ordered categories 
    Kendall's tau-b 
    Stuart's tau-c 
    Goodman-Kruskal's gamma 
    Spearman's rho 
    Somers' d 


For one-way tables with binomially or multinomially distributed data, confidence 
intervals on the cell proportions are available. XTAB also has the Mantel-Haenszel 
statistic for testing the association between two binary variables controlling for a 
stratification variable. 

Many other nonparametric statistics are computed elsewhere in SYSTAT. For example, 
CORR calculates matrices of coefficients like Spearman's rho, Kendall's tau-b, 
Guttman's mu2, Goodman-Kruskal's gamma, and tetrachoric correlation. 


Setup: 
* XTAB 
* USE 
PLENGTH 
SAVE 
HOT *  TABULATE 
HOT * STD 
PLENGTH command 
PLENGTH NONE 
SHORT 
MEDIUM 
LONG 


Each of four categories of outputs has statistics or features associated with it, described 
below. 


None: The argument NONE will not give any result. 


/ FREQ    Observed frequency for all table structures 
PERCENT   Percents for one-way, two-way, multiway tables: tabulate and multiway: 
          standardize table structures 
LIST      List layout for one-way, two-way, and multiway: tabulate table structures 
ROWP      Row percents for two-way, multiway tables: tabulate and multiway tables: 
          standardize 
COLP      Column percents for two-way, multiway tables: tabulate and multiway tables: 
          standardize 
EXPECT    Expected frequency for two-way tables and multiway tables: standardize 
CHISQ     Chi-square for one-way, two-way tables and multiway tables: standardize 
STAND     Standardized deviates for two-way tables and multiway tables: standardize 
FISHER    Fisher's exact tests for two-way tables and multiway tables: standardize 
LRCHI     Likelihood ratio chi-square test for two-way tables and 
          multiway tables: standardize 
YATES     Yates corrected chi-square for two-way tables and multiway tables: standardize 
ODDS      Odds ratio for two-way tables and multiway tables: standardize 
YULE      Yule's Q and Y for two-way tables and multiway tables: standardize 
COCHRAN   Cochran's test of linear trend for two-way tables and multiway tables: 
          standardize 
MCNEM     McNemar's test for symmetry for two-way tables and multiway tables: 
          standardize 
KAPPA     Cohen's kappa for two-way tables and multiway tables: standardize 
PHI       Phi for two-way tables and multiway tables: standardize 
CRAMER    Cramer's V for two-way tables and multiway tables: standardize 
CONT      Contingency coefficient for two-way tables and multiway tables: standardize 
UNCE      Uncertainty coefficient for two-way tables and multiway tables: standardize 
LAMBDA    Goodman-Kruskal's lambda for two-way tables and multiway tables: 
          standardize 
RHO       Spearman's rho for two-way tables and multiway tables: standardize 
GAMMA     Goodman-Kruskal's gamma for two-way tables and multiway tables: 
          standardize 
TAUB      Kendall's tau-b for two-way tables and multiway tables: standardize 
TAUC      Kendall's tau-c for two-way tables and multiway tables: standardize 
SOMERS    Somers' d for two-way tables and multiway tables: standardize 
MANTEL    Mantel-Haenszel test for 2x2 sub-table of multiway table: tabulate 
DEVI      Deviates for two-way tables and multiway tables: standardize 
TCP       Table of counts and percents for one-way, two-way, multiway tables: 
          tabulate and multiway tables: standardize 


Short: The argument SHORT requests the following features: 


LIST      alternate layout, with counts and cumulative counts, and percents and 
          cumulative percents as part of the display for one-way tables 
CHISQ     Pearson chi-square for one-way, two-way and multiway tables: standardize 
FREQ      frequency table for two-way, multiway tables: tabulate and multiway tables: 
          standardize 
PERCENT   replaces each cell frequency with its percent of the total table count for 
          two-way and multiway tables: tabulate 
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Medium. Requests the output for SHORT above, plus the following features: 


LIST 


FREQ 
PERCENT 


STAND 


2x2 tables 
LRCHI 
YATES 
FISHER 
ODDS 
YULE 

2xktables 
COCHRAN 

R xR tables 
MCNEM 
KAPPA 


alternate layout, with counts and cumulative counts, and percents and 
cumulative percents as part of the display for two-way and multiway 
tables: tabulate 


frequency table for one-way table 


replaces each cell frequency with its percent of the total table count for 
one-way table and multiway tables: s i 


standardized deviates (observed - expected) / SQR(expected) for each 
two-way tables and multiway tables: standardize 


likelihood ratio chi-square 

Yates’ corrected chi-square 

Fisher’s exact test 

odds ratio (cross-product ratio) and In(odds) with its standard error 
Yule’s О and Y. 


Cochran’s test of linear trend on proportions. 


McNemar’s test for symmetry 
Cohen’s kappa measure of agreement 


R x C tables, unordered levels 


PHI 
CRAMER 
CONT 
UNCE 
LAMBDA 


phi, a function of chi-square that does not increase with n. 
Cramer's V 

Contingency coefficient с 

uncertainty coefficient 

Goodman and Kruskal’s lambda 


R x C tables, ordered levels
RHO        Spearman's rho or rank correlation
GAMMA      Goodman-Kruskal's gamma
TAUB       Kendall's tau-b
TAUC       Stuart's tau-c
SOMERS     Somers' d

k x 2 x 2 tables
MANTEL     Mantel-Haenszel combined estimate of the odds ratio (specify a
           three-way table)
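To make a few of the MEDIUM measures concrete, here is a minimal Python sketch of the textbook formulas for the odds ratio, ln(odds) with its standard error, Yule's Q and Y, and the standardized deviates. It is an illustration only: the function names are hypothetical, expected values are assumed to be computed under independence (row total times column total over n), and SYSTAT's own computations may differ in detail (for example, in how zero cells are handled).

```python
import math


def two_by_two_measures(a, b, c, d):
    """Measures for the 2 x 2 table [[a, b], [c, d]]; all counts must be > 0."""
    odds = (a * d) / (b * c)  # cross-product ratio
    se_ln_odds = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    yule_q = (a * d - b * c) / (a * d + b * c)
    root_ad, root_bc = math.sqrt(a * d), math.sqrt(b * c)
    yule_y = (root_ad - root_bc) / (root_ad + root_bc)
    return {
        "odds": odds,
        "ln_odds": math.log(odds),
        "se_ln_odds": se_ln_odds,
        "yule_q": yule_q,
        "yule_y": yule_y,
    }


def standardized_deviates(table):
    """(observed - expected) / sqrt(expected) for every cell of a two-way table.

    Assumption: expected = row total * column total / n (independence).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for row, r in zip(table, row_totals):
        result.append([(obs - r * c / n) / math.sqrt(r * c / n)
                       for obs, c in zip(row, col_totals)])
    return result


# Hypothetical counts, used only to exercise the functions.
measures = two_by_two_measures(4, 1, 1, 4)
deviates = standardized_deviates([[4, 1], [1, 4]])
```

For the counts above the odds ratio is 16 and Yule's Y is 0.6; the other values follow from the same four counts.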
Long. Requests the output for SHORT and MEDIUM above, plus the following features:

TCP        table of counts and percents for one-way, two-way, and multiway tables:
           tabulate structures
EXPECT     expected values for each cell of two-way tables and multiway tables:
           standardize
DEVI       deviates (observed values - expected values) for each cell of two-way
           tables and multiway tables: standardize
ROWPCT     replaces each cell frequency with its percent of the row total (two-way
           tables, multiway tables: tabulate, and multiway tables: standardize)
COLPCT     replaces each cell frequency with its percent of the column total (two-way
           tables, multiway tables: tabulate, and multiway tables: standardize)
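The cell quantities named by EXPECT, DEVI, ROWPCT, COLPCT, and PERCENT can be sketched for a two-way table of counts as follows. This is a hedged illustration in Python, not SYSTAT's implementation: the function name and dictionary keys are hypothetical, and expected values are assumed to come from the usual independence model.

```python
def table_summaries(table):
    """Per-cell summaries for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Assumption: expected count = row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    deviates = [[obs - exp for obs, exp in zip(obs_row, exp_row)]
                for obs_row, exp_row in zip(table, expected)]
    rowpct = [[100 * obs / r for obs in row]
              for row, r in zip(table, row_totals)]
    colpct = [[100 * obs / c for obs, c in zip(row, col_totals)]
              for row in table]
    percent = [[100 * obs / n for obs in row] for row in table]
    return {"expected": expected, "deviates": deviates,
            "rowpct": rowpct, "colpct": colpct, "percent": percent}


# Hypothetical counts, used only to exercise the function.
summary = table_summaries([[10, 20], [30, 40]])
```

Each row of `rowpct` sums to 100, each column of `colpct` sums to 100, and all cells of `percent` together sum to 100.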


SAVE command 


SAVE filename 

Saves the frequency counts for each cell to a file. Specify SAVE before TABULATE. 
For two-way tables and multiway tables: standardize, the following are available: 

/ TABLES      saves expected values, deviates, and standardized deviates. This is the
              default.
  MEASURES    saves the table measures for two-way tables.
TABULATE command


XTAB: Crosstabulations with Tests and Measures 


TABULATE varlist 


TABULATE varlist*var1*var2*... 


Produces frequency tables for each variable in varlist. If you omit varlist, all variables
are tabulated. To produce two-way or multiway tables: tabulate, separate variables with
asterisks (*).
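As a rough illustration of what a tabulation computes, the Python sketch below counts cases for each level of one variable (a one-way table) and for each combination of levels of two variables (a two-way table). The helper name, the variable names, and the data are all hypothetical; it only mirrors the counting step, not SYSTAT's output layout.

```python
from collections import Counter


def tabulate(cases, *variables):
    """Count cases for each combination of levels of the named variables."""
    return Counter(tuple(case[v] for v in variables) for case in cases)


# Hypothetical data file with two categorical variables.
cases = [
    {"sex": "F", "smoker": "no"},
    {"sex": "F", "smoker": "yes"},
    {"sex": "M", "smoker": "no"},
    {"sex": "M", "smoker": "no"},
]

one_way = tabulate(cases, "sex")            # analogous to a one-way table
two_way = tabulate(cases, "sex", "smoker")  # analogous to crossing two variables
```

Crossing more variables simply lengthens the key tuple, which is the counting analogue of a multiway table.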


/ CONFI=n           for a one-way table, displays confidence intervals on cell
                    proportions at level n (0 <= n <= 1). For two-way tables, displays
                    confidence intervals for the odds ratio, Yule's Q and Y,
                    uncertainty, Goodman-Kruskal's lambda, Cohen's kappa, Spearman's
                    rho, Goodman-Kruskal's gamma, Kendall's tau-b, Stuart's tau-c, and
                    Somers' d. The default value is 0.95.
  MISS              when a table factor has missing category codes, includes a
                    category for "missing".
  SHADE=threshold   shades the cell values based on the standardized residual values.
                    The default value is 4.
  MINIMUM=varname   gives the minimum value of varname for each cell in the
                    frequency table.
  MAXIMUM=varname   gives the maximum value of varname for each cell in the
                    frequency table.
  SUM=varname       gives the sum of varname for each cell in the frequency table.
  MEAN=varname      gives the mean of varname for each cell in the frequency table.
  SD=varname        gives the standard deviation of varname for each cell in the
                    frequency table.
  RANGE=varname     gives the range of varname for each cell in the frequency table.
  VARIANCE=varname  gives the variance of varname for each cell in the frequency
                    table.


STD command


STD varlist
STD varlist*var1*var2*...


Produces two-way standardized frequency tables for two-way conditional tables of 
multiway tables. 


/ CONFI=n           for multiway tables: standardize, displays confidence intervals at
                    level n (0 <= n <= 1) for the odds ratio, Yule's Q and Y,
                    uncertainty, Goodman-Kruskal's lambda, Cohen's kappa, Spearman's
                    rho, Goodman-Kruskal's gamma, Kendall's tau-b, Stuart's tau-c, and
                    Somers' d. The default value is 0.95.
  SHADE=threshold   shades the cell values based on the standardized residual values.
                    The default value is 4.


Acronym & Abbreviation Expansions


A 

ABS - absolute value 

ACF - autocorrelation function 

ACOLOR - color axes 

ACS - arccosine 

ACT - actuarial life table 

AD test - Anderson Darling test 
ADDTREE - additive trees 

ADFG - asymptotically distribution free 
estimate biased, Gramian 

ADFU - asymptotically distribution free 
estimate unbiased 

ADJSEASON - seasonal adjustment 
AHMAX - maximum extent 

AHMIN - minimum extent 

AIC - Akaike information criterion 

AID - automatic interaction detection 

ALT - alternative 

ANCOVA - analysis of covariance 

ANG1 - deviation of angles from north in a
clockwise direction

ANG2 - deviation of angles from horizontal
(for 3D models)

ANG3 - tilt angle 

ANOVA - analysis of variance 
ANOVAHYPO - hypothesis tests in analysis 
of variance 

AR - autoregressive 

ARIMA - autoregressive integrated moving 
average 

ARL - average run length 

ARMA - autoregressive moving average 




ARS - adaptive rejection sampling 
ASCII - American Standard Code for 
Information Interchange 

ASE - asymptotic standard error 
ASN - arcsine 

ATH - arc hyperbolic tangent 

ATN - arctangent 

AVERT - vertical extent 

AVG - average 


B 

BC - Bray-Curtis similarity measure 
BCa - Bias Corrected and accelerated 
BCF - Beta cumulative function 
BDF - Beta density function 
BETACORR - beta correction 

BIC - Bayesian information criterion 
BIF - Beta inverse function 

BMP - Windows bitmap 

BOF - beginning-of-file 

BOG - beginning-of-BY group 
BONF - Bonferroni

BOOT - bootstrap 

BRN - Beta random number 


C 
C&RT - classification and regression trees 


CBSTAT - column basic statistics 

CCF - Cauchy cumulative function 

CCF - cross-correlation function 

CDF - Cauchy density function 

cdf/CF - cumulative distribution function 


CDFUNC - coefficients for canonical variables 
CFUNC - coefficients for the classification 
functions 

CGM - Computer graphics metafile: binary or 
clear text 

CHAZ - cumulative hazard 

CHISQ - Chi-square distribution 

CHOL - Cholesky decomposition 

CI - confidence interval 

CIF - Cauchy inverse function 

CIM - confidence interval of mean 
CLASS - classification 

CLSTEM - stem and leaf plot for column 
CMeans - canonical scores of group means 
CMULTIVAR - multiple string variables 
COEF - coefficients 

COL/col - column 

COLPCT - Column percentages 

CONFIG - configuration 

CONT - Contingency coefficient 

CONV - convergence 

CORAN - correspondence analysis 

CORR - correlations 

CORR1 - single correlation coefficient
CORR2 - equality of two correlations
COV - covariance 

Cp - process capability index 

CPL - process capability based on lower 
specification limit 

CPU - process capability based on upper 
specification limit 

Cpk - process capability index for off-centered
process

CR - confidence region 

CRA - cost of response above UTL 

CRB - cost of response below LTL 

CRN - Cauchy random number 
CSCORE - canonical scores 


CSIZE - size of characters 

CSQ - Chi-square 

CSTATISTICS - column statistics 
CSV - comma separated values 
CUSUM - cumulative sum 

CUSUM HI - Upper cumulative sum 
CUSUM LO - Lower cumulative sum 
CV - coefficient of variation 

CVI - cross validation index 


D 

DBF - Dbase files 

DC - deciles of risk 

DECF - Double exponential cumulative 
function 

DEDF - Double exponential density function 
DEIF - Double exponential inverse function 
DENFUN - density function 

dep. - dependent 

DERN - Double exponential random number 
DET - determinant 

DEVI - deviates (observed values - expected 
values) 

DEXP - Double exponential distribution 

df - degrees of freedom 

DF - distribution function 

DHAT - estimated distance 

DIF - data interchange format 

DIM - dimension 

DISCRIM - discriminant analysis 

DIST - distance 

DIT - dot histogram 

DOE - design of experiments 

DOS - disk operating system

DPMO - defects per million opportunities 
DPU - defects per unit 

DTA - Stata files 

DUCF - Discrete uniform cumulative function 
DUDF - Discrete uniform density function
DUIF - Discrete uniform inverse function 
DUNIFORM - Discrete uniform 

DURN - Discrete uniform random number 
DWLS - distance weighted least-squares 


E 

ECF - Exponential cumulative function 
EDF - Exponential density function 
EEXP - extreme value exponential 

EIF - Exponential inverse function 
EIGEN - eigenvalues 

ELAMBDA - exp(lambda) 

EM - expectation-maximization 

EMF - Windows enhanced metafile 
ENCF - Logit normal cumulative function 
ENDF - Logit normal density function 
ENIF - Logit normal inverse function 
ENORMAL - Logit normal 

ENRN - Logit normal random number 
EOF - end-of-file 

EOG - end-of-BY group 

EPS - Encapsulated postscript 

ERN - Exponential random number 

ES - exhaustive search 

ESS - error sum of squares 

EW - extreme value Weibull 

EWMA - exponentially weighted moving 
average 

EXP/exp - exponential/ expected 


F 

FAR - false-alarm rates 

FCF - F cumulative function 
FCOLOR - color foreground 

FDF - F density function 

FIF - F inverse function 

FINV - inverse of the F cumulative 




FITC - fitting distribution: continuous 
FITD - fitting distribution: discrete 
FITDIST - fitting distributions 
Flexibeta - flexible beta 

FPLOT - function plots 

FRN - F random number 

FTD - folded trellis detector 

FTDEV - Freeman-Tukey deviate 
FULLCOND - full conditional 

FUN - function 


G 

GCF - Gamma cumulative function 
GCOR - groupwise correlation matrix 
GCOV - groupwise covariance matrix 
GCV - generalized cross validation 
GDF - Gamma density function 

GECF - Geometric cumulative function 
GEDF - Geometric density function 
GEIF - Geometric inverse function 
GEN - general Toeplitz structure 
GERN - Geometric random number 
GG - Greenhouse Geisser 

GIF - Gamma inverse function 

GIF - Graphics Interchange Format 
GLM - generalized linear models 
GLMHYPO - hypothesis tests in general linear 
model 

GLMPOST - post hoc estimate for repeated 
measures in general linear model 

GLS - generalized least-squares 

GMA - geometric moving average 

GN - Gauss-Newton method 

GOCF - Gompertz cumulative function 
GODF - Gompertz density function 
GOIF - Gompertz inverse function 
GORN - Gompertz random number 
GRN - Gamma random number 


GUCF - Gumbel cumulative function
GUDF - Gumbel density function
GUIF - Gumbel inverse function
GURN - Gumbel random number


H 

H & L - Hosmer and Lemeshow

HC - heteroscedasticity-consistent 

HCF - Hypergeometric cumulative function 
HDF - Hypergeometric density function 
HF - Huynh-Feldt

HGEOMETRIC - hypergeometric 

HIF - Hypergeometric inverse function 
HIST - histogram 

HKB - Hoerl, Kennard, and Baldwin 
H-L trace - Hotelling-Lawley trace

HR - hit-rates 

HRN - Hypergeometric random number 
HSD - honestly significant differences 
HTERM - terms tested hierarchically 
HTML - hyper text markup language 
HYMH - hybrid Metropolis-Hastings 


I 

IF - Inverse cumulative distribution function 
IGAUSSIAN - inverse Gaussian 

IGCF - Inverse Gaussian cumulative function 
IGDF - Inverse Gaussian density function 
IGIF - Inverse Gaussian inverse function 
IGRN - Inverse Gaussian random number 
IIDMC - independently and identically 
distributed Monte Carlo 

IMPSAMPI - importance sampling integration 
IMPSAMPR - importance sampling ratio 
I-MR - individual and moving range 
Ind/indep - independent 

IndMH - Independent Metropolis-Hastings 


INDSCAL - individual differences scaling 
INITSAMP - initial sample 

INTEG FUN - integrated function 

IPA - iterated principal axis 

ITER - iterations 


J 

JACK - jackknife 

JCLASS - jackknifed classification 

JMP - JMP v3.2 data files 

JPEG/JPG - joint photographic experts group 


K 

K-M - Kaplan-Meier 

KNBD - kth nearest neighborhood 

KRON - Kronecker product 

K-S test - Kolmogorov-Smirnov test 

KS1 - one sample Kolmogorov-Smirnov tests 
KS2 - two sample Kolmogorov-Smirnov tests 


L 

LAD - least absolute deviations 

LB - larger the better 

LCF - Logistic cumulative function 

LCHAZ - log cumulative hazard 

LCL - lower control limit 

LCONV - log-likelihood convergence criteria 
LDF - Logistic density function 

LGM - log gamma 

LGST - logistic 

LIF - Logistic inverse function 

L-L/LL - log likelihood 

LMS - least median of squares

LMSREG - least median of squares regression
LNCF - Lognormal cumulative function 
LNDF - Lognormal density function 

LNIF - Lognormal inverse function 
LNOR/LNORMAL - lognormal 
LNRN - Lognormal random number
loc - location

LOG1 - one-parameter logistic (Rasch)
LOG2 - two-parameter logistic

LOGIT - logistic regression 
LOGITHYPO - hypothesis tests in logistic 
regression 

LOGLIN - loglinear modeling 

LR - likelihood ratio 

LRCHI - likelihood ratio chi-square 
LRDEV - likelihood ratio of deviate 
LRN - Logistic random number 

LS - least-squares 

LSD - least significant difference 

LSL - lower specification limit 

LSQ - least-squares 

LTAB - life tables 

LTL - lower tolerance limit 

LW - Lawless and Wang 


M 

MA - moving average 

MAD - mean absolute deviation 
MAHAL - Mahalanobis distances 
MANCOVA - multivariate analysis of 
covariance 

MANOVA - multivariate analysis of variance 
MANOVAHYPO - hypothesis tests in 
MANOVA 

MANOVAPOST - post hoc estimate for 
repeated measures in MANOVA 

MAR - missing at random 

MAX - maximum 

MAXSTEP - maximum number of steps 
MCAR - missing completely at random 
MCMC - Markov Chain Monte Carlo 
MDPREF - multidimensional preference 
MDS - multidimensional scaling 




MIN - minimum 

M-H - Metropolis-Hastings

MIS - number of missing values 

MIX - mixed regression 

MIXHIER - mixed regression for data having a 
hierarchical structure 

MIXMULTY - mixed regression for data 
having a multivariate structure 

ML - Maximum Likelihood 

MLA - maximum likelihood analysis 
MLE - maximum likelihood estimate 
MML - maximum marginal likelihood 
MRC - Multiple Regression and Correlation 
MS - mean squares 

MSE - mean square error 

MSIGMA - sigma measurement 

MT - Mersenne-Twister 

MTW - MINITAB v11 data files 

MU2 - Guttman's mu2 monotonicity 
coefficients 

MULTIVAR - multiple variables 

MW - minimum within sum of squares 
deviations 

MWL - maximum Wishart likelihood 


N 

NAR - non-stationary first-order autoregressive 
NB - nominal the best 

NBB - nominal-the-best: bilateral tolerance 
NBCF - Negative binomial cumulative function 
NBD - number of active bounds on parameter 
values 

NBDF - Negative binomial density function 
NBIF - Negative binomial inverse function 
NBINOMIAL - Negative binomial 

NBRN - Negative binomial random number 
NBU - nominal-the-best: unilateral tolerance 
NCAT - number of categories 


NCF - Binomial cumulative function 
NCOL - number of columns 

NDF - Binomial density function 

NDMAX - maximum number of points 
NDMIN - minimum number of points 
NEM - number of EM iterations 

NEXPO - negative exponential 

NIF - Binomial inverse function 

NIPALS - Nonlinear iterative partial least 
Squares 

NLAG - number of lags 

NLLOSS - nonlinear loss functions 
NLMODEL - nonlinear models 

NMIN - minimum count 

NMULTIVAR - multiple numeric variables 
NONLIN - nonlinear models 

NP - number nonconforming

NPAR - nonparametric 

NREC - non-recreationist 

NRN - Binomial random number 

NROW - number of rows 

NRP - number of apparently redundant 
parameters 

NSAMP - number of sub-samples 

NSPLIT - maximum number of splits 

NX - number of nodes along the x axis 
NXDIS - number of discretization points in the 
x (North) direction 

NY - number of nodes along the y axis 
NYDIS - number of discretization points in the 
y (East) direction 

NZ - number of nodes along the z axis 
NZDIS - number of discretization points in the 
z (Depth) direction 


O
Obs - observed
OBSFREQ - observed frequency 


OC - operating characteristic 

ODBC - open database capture and 
connectivity 

OFREQ - outlier frequencies 

OLS - ordinary least-squares 
ORTHEQ - equally spaced orthogonal
component

ORTHUN - unequally spaced orthogonal
component


P 

P - Proportion nonconforming 

PACF - Pareto cumulative function 
PACF - partial autocorrelation function 
PADF - Pareto density function 

PAIF - Pareto inverse function 
PARAM - parameters 

PARN - Pareto random number 

PCA - process capability analysis 
PCF - iterated principal axis factoring 
PCF - Poisson cumulative function 
PCNTCHANGE - percentage change 
PCT - Macintosh PICT 

PDF - Poisson density function 

pdf - probability density function 
PDL - polynomial distributed lag 
PERMAP - perceptual mapping 

PIF - Poisson inverse function 
PLIMITS - probability limits 

PLS - partial least squares

pmf - probability mass function 
PMIN - minimum proportion 

PNG - Portable Network Graphics 
POLY - polygon 

POSAC - partially ordered scalogram analysis 
with coordinates 

P-P - probability plot 

PP - process performance 
Ppk - process performance index for off-centered
process

PPL - process performance based on lower 
specification limit 

PPM - parts per million 

PPU - process performance based on upper 
specification limit 

PRE - percentage reduction error 
PREFMAP - preference mapping 

PRN - Poisson random number 

PROB - probability 

PROP1 - single proportion

PROP2 - equality of two proportions 

PS - PostScript 

PVAF/p.v.a.f. - present value annuity factor
p-value - probability value 


Q 

QC - quality control 

QMLE - quasi maximum likelihood estimate 
QNTL - quantiles 

QPLOT - quantile plots 

Q-QPLOT - two sample quantile plot 
QRD - QR decomposition 

QS - quick search 

QSK - quantitative symmetric similarity
coefficients (or Kulczynski measure)
QUASI - Quasi-Newton method 


R 

R & R - repeatability and reproducibility 

R chart - range chart 

RADMAX - maximum horizontal direction for 
the search radius 

RADMIN - minimum horizontal direction for 
the search radius 

RAMONA - Reticular Action Model or Near 
Approximation 




RAND - random 

RANDSAMP - random sampling 
RANKREG - rank regression 
RBSTAT - row basic statistics 

RCF - Rayleigh cumulative function
RDF - Rayleigh density function 
RDISCRIM - robust discriminant 
RDIST - robust distance 

RDVER - vertical direction for the search 
radius 

REPAR - reparametrize 

REPS - replicates 

RESID - residuals 

RIF - Rayleigh inverse function 

RJS - rejection sampling 

RMS - root mean square 

RMSEA - root mean square error of 
approximation 

RMSSTD - root mean square standard 
deviation 

ROC - receiver operating characteristic 
ROWPCT - Row percentages 

RRN - Rayleigh random number 

RS - response surface 

RSE - robust standard errors

RSEED - random seed

RSM - response surface methods

RSQ - stress and squared correlation 
RSS - residual sum of squares 
RSTATISTICS - row statistics 

RTF - rich text format

RWM-H - random walk Metropolis-Hastings 
RWSTEM - stem and leaf plot for rows 


S 

S chart - standard deviation control chart 
SANG1 - angle (in degrees) of the first minor
axis of the search ellipsoid


SANG2 - angle (in degrees) of the major axis of 
the search ellipsoid 

SANG3 - angle (in degrees) of the second 
minor axis of the search ellipsoid 

SAV - SPSS files 

SB - smaller the better 

sc - scale 

SC - set correlation 

SCDFUNC - standardized coefficients for 
canonical variables 

SCF - Studentized cumulative function 
SD - standard deviations 

sd2/sas7bdat - SAS v9 files 

SDF - Studentized density function 
SE/se/S.E. - standard error 

SEK - standard error of kurtosis 

SEM - standard error of mean 

SES - standard error of skewness 
SETCOR - Set and Canonical Correlations 
shp - shape 

SIF - Studentized inverse function 
SIMPLS - Straight-forward Implementation of 
Partial Least Squares 

SKMEAN - simple kriging mean 

SL - specification limit 

SMIN - minimum split value 

SPLOM - scatter plot matrix 

SQL - structured query language 
SQRT/SQR - square root

SRN - Studentized random number 
SRWR - sum of rank weighted residuals 
SS - sum of squares 

SSCP - sum of squares and cross products 
STA - Statistica v5 data files 

STAND - standardized deviates 

SVD - singular value decomposition 

SW - Shapiro-Wilks 

SYC/CMD - SYSTAT command Files 


SYZ/SYD/SYS - SYSTAT data files 
SYO - SYSTAT output files 


T 

T1 - one-sample t-test 

T2 - two-sample t-test 

TANALYZE - Taguchi design: analyze 

TCF - t cumulative function 

TCOR - total correlation 

TCOV - total covariance 

TDF - t density function

TESTAT - Test Item Analysis 

TESTATCL - classical test item analysis 
TESTATLOG - logistic item response analysis 
TETRA - tetrachoric correlations 
TGENERATE - Taguchi design: generate 

TIF - t inverse function 

TIFF - Tagged Image File Format 

TLOG - log time 

TLOSS - Taguchi's Loss Function 

TNH - hyperbolic tangent 

TOHC0 - Hypothesis Testing: Zero correlation
TOHC1 - Hypothesis Testing: Specific
correlation

TOHC2 - Hypothesis Testing: Equality of two
correlation coefficients

TOHP1 - Hypothesis Testing: Single
proportion

TOHP2 - Hypothesis Testing: Equality of two
proportions

TOHT1 - Hypothesis Testing: One sample
t-test

TOHT2 - Hypothesis Testing: Two sample
t-test

TOHTPAIRED - Hypothesis Testing: Paired
t-test

TOHV1 - Hypothesis Testing: Single variance
TOHV2 - Hypothesis Testing: Two variances
TOHVN - Hypothesis Testing: Several
variances

TOHZ1 - Hypothesis Testing: One sample
z-test

TOHZ2 - Hypothesis Testing: Two sample
z-test

TOL - tolerance 

TPLOT - time series plot 

TPREDICT - Taguchi design: predict 

TRCF - Triangular cumulative function 
TRDF - Triangular density function 

TRI - triangular 

TRIF - Triangular inverse function 

TRIM - trimmed mean 

TRN - t random number 

TRP - transpose 

TRRN - Triangular random number 
TSFOURIER - Fourier decomposition of time 
series 

TSIV - Two-Stage Instrumental Variables 
TSLS - Two-Stage Least Squares 

TSP - traveling salesman path 

TSQ chart - Hotelling's T² chart
TSSMOOTH - smoothing time series 

TXT - text format 


U

U chart - chart showing defects per unit
UCF - Uniform cumulative function

UCL - upper control limit

UDF - Uniform density function

UIF - Uniform inverse function

UNCE - uncertainty coefficient

URN - Uniform random number

USL - upper specification limit

UTL - upper tolerance limit


V 




VAR - variance 
VIF - variance inflation factor 


W 

WB - Weibull 

WCF - Weibull cumulative function 
WCOR - pooled within-group correlation 
WCOV - pooled within-group covariance 
WDF - Weibull density function
WHISKER - Box-and-Whisker plot

WIF - Weibull inverse function

WMF - Windows metafile

WRN - Weibull random number 


X 

XCF - Chi-square cumulative function 
XDF - Chi-square density function 

XIF - Chi-square inverse function 

XLAG - separation distance between lags 
XLS - Excel format

XLTOL - tolerance for lags 

XMAX - maximum along x axis 

XMIN - minimum along x axis 

X-MR chart - Individuals and moving range 
chart 

XPT/TPT - SAS transport files 

XRN - Chi-square random number 
XTAB - Crosstabulations 


Y 
YMAX - maximum along y axis 
YMIN - minimum along y axis 


Z 

Z1 - one-sample z-test 

Z2 - two-sample z-test 

ZCF - Normal cumulative function 


ZDF - Normal density function 
ZICF - Zipf cumulative function 
ZIDF - Zipf density function 
ZIF - Normal inverse function 
ZIIF - Zipf inverse function 
ZIRN - Zipf random number 
ZMAX - maximum along z axis 
ZMIN - minimum along z axis 
ZRN - Normal random number 


ABS, 53 
ACF 
in SERIES, 337 
ACOLOR, 124 
ACS, 53 
ACT 
in SURVIVAL, 373 
AD 
in NPAR, 275 
ADD 
in CLUSTER, 145 
ADJSEASON 
in SERIES, 341 
AIC and Schwarz’s BIC, 315 
ALL 
in GLM, 201 
ALT 
in LOGIT, 212 
ALTITUDE, 119 
AMATRIX 
in GLM, 203 
in MANOVA, 229 
analysis of covariance 
using GLM, 191, 194 
analysis of variance 
ANOVA command, 128 
using GLM, 191, 194 
AND, 52 
Andrews' Fourier plots, 70
FOURIER command, 80
CATEGORY, 130
COVARIATE, 130 
DEPEND, 129 
ESTIMATE, 132 
HYPOTHESIS, 132 
PLENGTH, 131 
SAVE, 131 

APPEND, 29 

ArcView files, 5 

ARE, 54 

ARIMA 
in SERIES, 342 


ARL, 298 
ARRAY, 49 
ARU, 54 
ASC$, 59
ASCII files, 5 
ASN, 53 
associations 
CORR, 150 
AT2, 53 
ATH, 53 
ATN, 53 


AUTO 
in MIX, 255 


average run length curves 
ARL command, 298 


AVG, 54 
AXES, 121 
axes, 119 
bar charts, 64
BAR command, 72
basic statistics
CLSTEM, 362 
CRONBACH, 363 
CSTATISTICS, 359 
MNTEST, 363 
RSTATISTICS, 362 
RWSTEM, 363 
SAMPLE, 364 
SSAVE, 364 
BAYESIAN 
ESTIMATE, 135 
MODEL, 134 
SAVE, 135 
Bayesian regression 
BAYESIAN command, 134 
BC, 156 
in CORR, 156 
BEGIN, 113 
BMDP files, 5 
BMP, 45 
BOX, 74, 296 
box plots, 66 
BOX command, 296 
DENSITY command, 74 
BOXBEHNKEN 
in DESIGN, 172 
BOXHUNTER 
in DESIGN, 169 
bubble plots, 69 
BY, 39 


C


CALCULATE, 23 

CALL 
in matrix, 237 

CANONICAL 
in RSM, 332 

canonical correlation analysis 
SETCOR command, 345 
using GLM, 191 

CAP$, 58 

CASE, 52 

CAT$, 59


CATEGORY, 20 

in ANOVA, 130, 346 

in GLM, 196 

in LOGIT, 211 

in MANOVA, 225 

in MIX, 256 

in MIXED, 260 

in PROBIT, 292 

in SETCOR, 346 

in VC, 393 
CCF 

in SERIES, 337 
CENTER, 36, 126 
CGM, 45 
CITY, 157 

in CORR, 157 
CLASSIC, 41 
classification trees 

TREES command, 387 
CLEAR 

in matrix, 235 

in SERIES, 335 
CLEAR ARRAYS, 50 
CLEAR MATRIX 

in matrix, 236 
CLEAR VARIABLES, 47 
CLSTEM 

in basic statistics, 362 
CLUSTER, 136 

ADD, 145 

IDVAR, 143 

JOIN, 137 

KMEANS, 140 

KMEDIANS, 142 

SAVE, 144 
cluster analysis 

CLUSTER command, 136 
CMATRIX 

in GLM, 203 

in MANOVA, 229 
CNT$, 58
COD, 54, 56 
COL, 123 
COLNAME 
COLOR, 124
COMPLETE, 52 
CONFIG 

in MDS, 245 
CONJOINT, 146 

ESTIMATE, 147 

MODEL, 146 

SAVE, 146 
conjoint analysis 

CONJOINT command, 146 
CONSTRAIN 

in LOGIT, 215 
CONT, 161 
CONTINUOUS 

in FITDIST, 188 
CONTOUR 

in RSM, 334 
CONTRAST 

in DISCRIM, 176 

in GLM, 202 

in MANOVA, 228 
CONVERT 

in MIX, 252 
CORAN, 148 

ESTIMATE, 149 

MODEL, 148 

SAVE, 148 
CORR 

BC, 156 

CITY, 157 

CONT, 161 

CORR command, 150 

COVARIANCE, 154 

CRAMER, 162 

DICE, 164 

dichotomy coefficients, 167 

distance measures, 156 

EUCLIDEAN, 157 

GAMMA, 158 

GOWER, 167 

HAMMAN, 164 

KULCZY, 166 

LAMBDA, 161 

MU2, 159 
PEARSON, 153

PHI, 160 

PLENGTH, 151 

QSK, 156 

S7, 165 

SAMPLE, 152 

SAVE, 152 

similarity measures, 156 

SNEATH, 165 

SPEARMAN, 158 

SSCP, 155 

TAUB, 159 

TAUC, 160 

TETRA, 163 

UNCE, 162 

YULEQ, 163 
correlations 

binary data, 163 

canonical, 345 

continuous data, 153 

CORR, 150 

rank order data, 158 

set, 345 
correspondence analysis 

CORAN command, 148 


COS, 53 
COVARIANCE, 154 
in CORR, 154 
COVARIATE 
in ANOVA, 130 


CRAMER 
in CORR, 162 


crosstabulation 
XTAB command, 400 


CSIZE, 116, 126 


CSTATISTICS 
in basic statistics, 359 


cumulative sum charts 
CUSUM command, 299 


CUSTOM 
in RSM, 330 


CUSUM, 299 
CUT, 35, 56, 126 


DASH, 124 
DAT, 57 
Data Interchange Format files, 5 
date functions, 57 
dBase files, 5 
DBF, 5 
DC 
in LOGIT, 214 
define variables 
DEFVAR command, 33 
DEFVAR, 33 
Delaunay triangulations, 69 
DELETE COLUMNS, 35 
DELETE ROWS, 35 
DENSITY, 74 
density displays, 66 
density stripes 
DENSITY command, 74 
DEPEND 
in ANOVA, 129 
DEPTH, 116 
DESIGN, 168 
BOXBEHNKEN, 172 
BOXHUNTER, 169 
FACTORIAL, 169 
LATIN, 170 
MIXTURE, 173 
PLACKETT, 172 
SAVE, 169 
TAGUCHI, 171 
design of experiments 
DESIGN command, 168 
DESIRABILITY 
in RSM, 333 
DIAGONAL, 5 
DIALOG, 13 
dialog boxes 
calling from commands, 13 
DICE 
in CORR, 164 
dichotomy coefficients, 167 
DIFFERENCE
in SERIES, 338 
DIRECTION, 124 
DIRECTORY 
in matrix, 237 
DISCRETE 
in FITDIST, 187 
DISCRIM, 174 
CONTRAST, 176 
ESTIMATE, 178 
MODEL, 175 
PLENGTH, 176 
SAVE, 178 
START, 178 
STEP, 179 
STOP, 181 
discriminant analysis 
classical discriminant analysis, 174 
DISCRIM command, 174 
robust discriminant analysis, 312
using GLM, 191 
distance measures 
CORR, 150 
distributions, 60 
dit plots, 66 
DMATRIX 
in GLM, 204 
in MANOVA, 230 
in MIXED, 265 
in VC, 398 
DOC, 57 
DOT, 76 
dot charts 
DOT command, 76 
dot density plots 
DENSITY command, 74 
dot plots, 66 
DOW$, 57
DRAW, 78 
draw objects 
DRAW command, 78 
DROP, 35 
DUAL, 123
DWORK, 7
ECHO, 40
EFFECT 
in GLM, 201 
in MANOVA, 227 
ELSE, 49 
EMF, 45 
END, 113 
EPS, 45 
ERROR, 124 
in GLM, 205 
in MANOVA, 230 
in SETCOR, 346 
error bars, 124 
ESAVE, 6 
ESTIMATE, 329 
in ANOVA, 132 
in BAYESIAN, 135 
in CONJOINT, 147 
in CORAN, 149 
in DISCRIM, 178 
in FACTOR, 184 
in GLM, 198 
in LOGIT, 211 
in LOGLIN, 218 
in MANOVA, 227 
in MDS, 246 
in MISSING, 249 
in MIX, 258 
in MIXED, 263 
in NONLIN, 271 
in PERMAP, 279 
in PLS, 282 
in POSAC, 283 
in POWER, 291 
in PROBIT, 293 
in RAMONA, 309 
in RDISCRIM, 314 
in REGRESS, 318 
in RIDGE, 323
in ROBREG, 329 
in RSM, 332 
in SETCOR, 347 
in SIGNAL, 349 
in SMOOTH, 351 
in SURVIVAL, 369 
in TESTAT, 376 
in TREES, 388 
in TSLS, 390 
in VC, 396 
ETHICK, 124 
ETYPE, 124 
EUCLIDEAN, 157 
in CORR, 157 
EWMA, 301 
EWORK, 7 
Excel files, 5 
EXIT, 27 
EXP, 53 
EXPONENTIAL 
in SERIES, 340 
EXPORT, 8 
EXTRACT, 30 
EYE, 114 


F


FACET, 115 
FACTOR, 182 
ESTIMATE, 184 
in GLM, 206 
MODEL, 182 
PLENGTH, 183 
SAVE, 183 
factor analysis 
FACTOR command, 182 
FACTORIAL 
in DESIGN, 169 
FCOLOR, 124 
files 
file locations, 25 
FILL, 124 
FITDIST 
CONTINUOUS, 188
DISCRETE, 187 
SAVE, 189 
Fitting Distribution 
FITDIST command, 187 
FIX 
in NONLIN, 272 
FMATRIX 
in MIXED, 264 
in VC, 398 
FONT, 115 
FOR...NEXT, 48 
FORMAT, 41 
in matrix, 235 
FOURIER, 80 
in SERIES, 344 
FPATH, 25 
FPLOT, 81 
FREQ, 19 


frequency polygons 
DENSITY command, 74 
FRIEDMAN 
in NPAR, 275 
function plots 
FPLOT command, 81 
FUNCTION...RETURN, 50 


FUNPAR 
in NONLIN, 270 
in SURVIVAL, 368 


fuzzygrams, 67 
DENSITY command, 74 


G
GAMMA, 158 
in CORR, 158 
gap displays, 66 
general linear models 
GLM command, 191 
GET, 3 


GLM, 191 
ALL, 201 
AMATRIX, 203 


analyses, 194 
CATEGORY, 196 
CMATRIX, 203 
CONTRAST, 202 
DMATRIX, 204 
EFFECT, 201 
ERROR, 205 
ESTIMATE, 198 
FACTOR, 206 
HYPOTHESIS, 201 
MODEL, 195 
PAIRWISE, 205 
PLENGTH, 197 
POST, 204 
PRIORS, 206 
ROTATE, 207 
SAVE, 196, 207 
SPECIFY, 202 
STAND, 206 
START, 198 
STEP, 199 
STOP, 200 
TEST, 207 
TYPE, 207 
WITHIN, 202 
GOWER 
in CORR, 167 
GPRINT, 44 
GRAPH, 43 
graphs 
axes, 121 
axis labels, 116, 119 
colors, 117, 124 
control limits, 124 
curves, 126 
error bars, 124 
fill patterns, 124 
font, 115 
formatting scales, 126 
global features, 113 
GPRINT command, 44 
GSAVE command, 45 
labels, 126 
legends, 116, 119 
lines, 116, 124 
local options, 118 
multigroup displays, 123

multivariable displays, 123 

ordering categories, 123 

overlaying plots, 113, 115, 123 

perspective, 114 

position, 113, 119 

printing, 44 

saving, 45 

scales, 116, 121 

size, 114, 119 

surfaces, 126 

symbols, 120 

titles, 116, 119 

transforming data, 123 

Trellis displays, 123 
GRID 

in SPATIAL, 355 
GROUP, 123 


GSAVE, 45 
HAMMAN

in CORR, 164 
HAZARD 

in SURVIVAL, 373 
HEIGHT, 119 
HELP, 23 
high-low-close plots, 69 
Histogram command 

HIST command, 74, 295 
histograms, 66 

DENSITY command, 74 
HTML format, 45 
HYPOTHESIS 

in ANOVA, 132 

in GLM, 201 

in LOGIT, 215 

in MANOVA, 227 

in MIXED, 264 

in REGRESS, 320 

in VC, 397 
hypothesis testing 

test for correlations, 383 

test for variances, 382 
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tests for means, 378 
tests for proportions, 385 
icon plots, 70

ICON command, 83 
IDENTIFIER 

in MIX, 254 
IDVAR, 22 

in CLUSTER, 143 
IF... THEN, 48 
IMPORT, 5 
INC, 54, 56 
IND, 58 
INDEX 

in SERIES, 338 


influence plots, 69 

INPUT, 3, 4 

INSERT, 32 

INT, 53 

item-response analysis 
TESTAT command, 374 
jittered density plots, 67


jittered dot density plots 
DENSITY command, 74 


JOIN 
in CLUSTER, 137 


JPG, 45 


K
KEY 

in TESTAT, 376 
KILL, 27 
KMEANS 

in CLUSTER, 140, 142 
KMEDIANS 

in CLUSTER, 142 
KRIG
in SPATIAL, 357 


KRUSKAL 
in NPAR, 273 


KS 
in NPAR, 274, 276 


KULCZY 
in CORR, 166 
L10, 53

LAB$, 56, 59
LABEL, 21, 126 


LAD, 325 

in ROBREG, 325 
LAG, 53 
LAMBDA 

in CORR, 161 
landscape orientation, 44 
LATENT 

in RAMONA, 307 
LATIN 

in DESIGN, 170 
LDISPLAY, 42, 61 


LEGEND, 119 
LFT$, 58
LGM, 53 
LINE, 86 
line charts 
LINE command, 86 
linear mixed models 
MIXED command, 259 
linear regression 
REGRESS command, 315 
LINK, 13 
LIST, 9 
list cases 
LIST command, 9 
LLABEL, 119 
LMS, 326 


in ROBREG, 326 
LNOTE, 10 


LOC, 119 
LOG, 53 
in SERIES, 338 
logical operators, 52 
logistic regression 
LOGIT command, 208 
LOGIT, 208 
ALT, 212 
CATEGORY, 211 
CONSTRAIN, 215 
DC, 214 
ESTIMATE, 211 
HYPOTHESIS, 215 
MODEL, 209 
NCAT, 212 
QNTL, 214 
SAVE, 214 
SET, 212 
SIMULATE, 215 
START, 213 
STEP, 213 
STOP, 213 
TEST, 216 
LOGLIN, 217 
ESTIMATE, 218 
MODEL, 218 
PLENGTH, 220 
SAVE, 220 
TABULATE, 221 
ZERO, 219 
loglinear modeling 
LOGLIN command, 217 
LOSS 
in NONLIN, 267 
Lotus files, 5 
LOW$, 58
LPD$, 58 
LTAB 
in SURVIVAL, 371 
LTITLE, 119 
LTS, 327 
in ROBREG, 327 
M, 325
in ROBREG, 325 
MA, 300 
MANIFEST 
in RAMONA, 307 
MANOVA 
AMATRIX, 229 
CATEGORY, 225 
CMATRIX, 229 
CONTRAST, 228 
DMATRIX, 230 
EFFECT, 227 
ERROR, 230 
ESTIMATE, 227 
HYPOTHESIS, 227 
MODEL, 224 
PAIRWISE, 230 
PLENGTH, 226 
POST, 230 
SAVE, 226 
SPECIFY, 229 
TEST, 231 
WITHIN, 228 
MAP, 88 
maps 
MAP command, 88 
MAT 
in matrix, 233, 237 
matrices 
generation of, 238 
manipulation of, 239 
matrix algebra, 239 
transformations, 241, 242 
MATRIX, 123 
matrix, 232 
CALL, 237 
CLEAR, 235 
CLEAR MATRIX, 236 
COLNAME, 234 
DIRECTORY, 237 
FORMAT, 235 
MAT, 233, 237


MDELETE COLUMN, 236 


MDELETE ROW, 236 


MLET, 237 
MSAVE, 235 
MSELECT, 236 
ROWNAME, 234 
SHOW, 235 
USE, 234 
MAX, 54 
MDELETE COLUMN 
in matrix, 236 
MDELETE ROW 
in matrix, 236 
MDS, 244 
CONFIG, 245 
ESTIMATE, 246 
MODEL, 245 
SAVE, 246 


MEAN 
in SERIES, 338 

MERGE, 28 

MID$, 58

MIN, 54 

minimal spanning trees, 69 

MIS, 54 

MISSING, 248 
ESTIMATE, 249 
in SERIES, 336 
MODEL, 248 
SAVE, 249 

missing value analysis 
MISSING command, 248 

MIX, 251, 280 
AUTO, 255 
CATEGORY, 256 
CONVERT, 252 
ESTIMATE, 258 
IDENTIFIER, 254 
MODEL, 254 
PLENGTH, 256 
RANDOM, 254 
SAVE, 257 

MIXED, 259 
CATEGORY, 260 
DMATRIX, 265 
ESTIMATE, 263 
FMATRIX, 264
HYPOTHESIS, 264 
MODEL, 260 
PAIRWISE, 264 
PLENGTH, 260 
RANDOM, 261 
REPEATED, 262 
RESET, 259 
RMATRIX, 265 
SAVE, 262 
TEST, 265 
mixed regression 
MIX command, 251 
MIXTURE 
in DESIGN, 173 
MKTEST 
in SERIES, 343 
MLET 
in matrix, 237 
MNTEST 
in basic statistics, 363 
MOD, 53 
MODEL, 322, 325 
in BAYESIAN, 134 
in CONJOINT, 146 
in CORAN, 148 
in DISCRIM, 175, 312 


in FACTOR, 182
in GLM, 195
in LOGIT, 209
in LOGLIN, 218
in MANOVA, 224
in MDS, 245
in MISSING, 248
in MIX, 254
in MIXED, 260
in NONLIN, 267
in PERMAP, 278
in PLS, 280
in POSAC, 283
in POWER, 284
in PROBIT, 292
in RAMONA, 306
in REGRESS, 316
in RIDGE, 322
in ROBREG, 325
in RSM, 331
in SETCOR, 345
in SIGNAL, 348
in SMOOTH, 350
in SPATIAL, 353
in SURVIVAL, 367
in TESTAT, 374
in TREES, 387
in TSLS, 389
in VC, 392
MONOCHROME, 117
moving average charts
EWMA command, 301
MA command, 300
MSAVE
in matrix, 235
MSELECT
in matrix, 236
MU2, 159
in CORR, 159
multidimensional scaling
MDS command, 244
MULTIPLOT, 123
multiplots, 70
multivariate analysis of covariance
using MANOVA, 222
multivariate analysis of variance
MANOVA command, 222
multivariate multiple regression
using MANCOVA, 223
N
NAHAZARD
in SURVIVAL, 372
NAMES, 22
NCAT
in LOGIT, 212
nd, 317
NEW, 3


non parametric tests 
one sample tests, 275 

NONLIN, 266 
ESTIMATE, 271 
FIX, 272
FUNPAR, 270 
LOSS, 267 
MODEL, 267 
PLENGTH, 269 
RESET, 268 
ROBUST, 268 
SAVE, 270 

nonlinear models 
NONLIN command, 266 


nonparametric tests 


independent samples tests, 273, 274 


NPAR command, 273 
one sample tests, 276, 277 


related variables tests, 274, 275 


normal density curve, 66 

NOT, 52 

NOTE, 10 

NOW$, 57

NPAR, 273 
AD, 275 
FRIEDMAN, 275 
KRUSKAL, 273 
KS, 274, 276 
QUADE, 275 
RUNS, 277 
SAVE, 277 
SIGN, 274 
WILCOXON, 274 

NUM, 54 


O
OC, 298 
OCHIAI 
in CORR, 166 
ODBC 
IMPORT command, 5 


operating characteristic curves 
OC command, 298 


OPTIMIZE 
in RSM, 334 


OPTIONS, 23 
OR, 52 
ORDER, 43
ORIGIN, 113 
OSAVE, 45 
OUTPUT, 39 
output 
saving, 45 
text output, 41 
turning tables off, 41 


OUTPUT command, 39 
OVERLAY, 123 
PACF

in SERIES, 337 
PAGE, 40 
PAIRWISE 

in GLM, 205 

in MANOVA, 230 

in MIXED, 264 

in VC, 397 
PARALLEL, 90 


parallel coordinate displays 
PARALLEL command, 90 


parallel coordinate plots, 70 


Pareto charts 
PARETO command, 295 


partial least squares regression 
PLS command, 280 


partially ordered scalogram analysis with coordinates
POSAC command, 283 


PASTE, 35 


path analysis 
RAMONA command, 306 


PCA, 304 


PCNTCHANGE 
in SERIES, 339 


PEARSON, 153 
in CORR, 153 
perceptual mapping 
PERMAP command, 278 
PERMAP, 278 
ESTIMATE, 279
MODEL, 278 
PHI 
in CORR, 160 
PICT, 45 
PIE, 91 
pie charts, 65 
PIE command, 91 
PLACKETT 
in DESIGN, 172 
PLENGTH, 41, 328, 368 
in ANOVA, 131 
in DISCRIM, 176 
in FACTOR, 183 
in GLM, 197 
in LOGLIN, 220 
in MANOVA, 226 
in MIX, 256 
in MIXED, 260 
in NONLIN, 269 
in PLS, 281 
in QC, 295 
in RAMONA, 308 
in RDISCRIM, 313 
in REGRESS, 317 
in ROBREG, 328 
in SETCOR, 346 
in SURVIVAL, 368 
in TESTAT, 375 
in TREES, 388 
in VC, 393 
in XTAB, 401 
PLOT, 92 
plots 
see scatterplots 
PLS, 280 
ESTIMATE, 282 
MODEL, 280 
PLENGTH, 281 
SAVE, 281 
POINT 
in SPATIAL, 358 
POISSON 
in TESTING, 381 
POP, 25 


POR, 5 
portrait orientation, 44 
POSAC, 283 
ESTIMATE, 283 
MODEL, 283 
SAVE, 283 
POST 
in GLM, 204 
in MANOVA, 230 
POWER, 284 
ESTIMATE, 291 
MODEL, 284 
SAVE, 290 
power analysis 
POWER command, 284 
PPLOT, 98 
PREDICT 
in REGRESS, 321 
Principal components analysis 
using GLM, 191 
PRINT 
PRIORS 
in GLM, 206 
probability plots, 68 
PPLOT command, 98 
PROBIT, 292 
CATEGORY, 292 
ESTIMATE, 293 
MODEL, 292 
SAVE, 293 
probit analysis 
PROBIT command, 292 
process control analysis charts 
PCA command, 304 
PROFILE, 102 
profile charts, 64 
PROFILE command, 102 
PROP 
in TESTING, 386 
PUSH, 24 
PUT, 7 
PUT$, 58
PYRAMID, 104 
pyramid charts, 64
PYRAMID command, 104 
QC, 294
ARL, 298 
BOX, 296 
CUSUM, 299 
EWMA, 301 
HIST, 295 
MA, 300 
OC, 298 
PARETO, 295 
PCA, 304 
PLENGTH, 295 
QCREGRESS, 302 
RUNCHART, 296 
SAVE, 294 
SHEWHART, 297 
TSQ, 303 
XMR, 301 
QCREGRESS 
in QC, 302 
QNTL 
in LOGIT, 214 
in SURVIVAL, 372 
QPLOT, 106 
QSK, 156 
in CORR, 156 
QUADE 
in NPAR, 275 
quality control charts 
QC command, 294 
quantile plots, 68 
QPLOT command, 106 
Quick Graphs, 43 
QUIT, 27 
RAMONA, 306
ESTIMATE, 309 
LATENT, 307 
MANIFEST, 307 
MODEL, 306 
PLENGTH, 308
RANDOM 
in MIX, 254 
in MIXED, 261 
in VC, 393 
random numbers 
RSEED command, 23, 24 
random sampling 
RANDSAMP command, 310 
univariate discrete random sampling, 310 
univariate random sampling, 310 
RANDSAMP 
UNIVARIATE, 310 
RANDSAMP command 
in random sampling, 310 
RANK, 37 
RDISCRIM 
ESTIMATE, 314 
MODEL, 312 
PLENGTH, 313 
SAVE, 314 
RECODE, 37 
REGRESS, 315 
ESTIMATE, 318 
HYPOTHESIS, 320 
MODEL, 316 
PLENGTH, 317 
PREDICT, 321 
SAMPLE, 318 
SAVE, 317 
START, 319 
STEP, 319 
STOP, 320 
regression 
bayesian, 134 
linear, 191, 194, 315 
logistic, 208 
nonlinear, 266 
polynomial, 191 
two-stage least squares, 389 
regression charts 


regression trees 
TREES command, 387 


relational operators, 52 
RELIABILITY
in SURVIVAL, 372 
REM, 11 
REPEAT, 32, 123 
REPEATED 
in MIXED, 262 
Resampling 
SAMPLE, 51 
RESET 
in NONLIN, 268 
in VC, 392 
response surface methods 
RSM command, 330 
RGT$, 58
RIDGE, 322 
ESTIMATE, 323 
in RSM, 333 
MODEL, 322 
SAVE, 323 
ridge regression 
RIDGE command, 322 
RMATRIX 
in MIXED, 265 
in VC, 398 
RNDGEN, 24 
ROBREG, 324 
ESTIMATE, 329 
LAD, 325 
LMS, 326 
LTS, 327 
M, 325 
MODEL, 325 
PLENGTH, 328 
RRANK, 327 
S, 327 
SAVE, 328 
ROBUST 
in NONLIN, 268 
robust discriminant analysis 


RDISCRIM command, 312 


robust regression 

ROBREG command, 324 
ROC curves, 348 
ROTATE 


in GLM, 207 
ROW, 123 
ROWNAME 
in matrix, 234 
RPD$, 58
RRANK, 327 
in ROBREG, 327 
RSEED, 23, 24 
RSM, 330 
CANONICAL, 332 
CONTOUR, 334 
CUSTOM, 330 
DESIRABILITY, 333 
ESTIMATE, 332 
MODEL, 331 
OPTIMIZE, 334 
RIDGE, 333 
SAVE, 331 
SURFACE, 334 
RSTATISTICS 
in basic statistics, 362 
RTF, 45 
RUNCHART 
in QC, 296 
RUNS 
in NPAR, 277 
RWSTEM 
in basic statistics, 363 
in ROBREG, 327
S7 
in CORR, 165 
SAMPLE 
in basic statistics, 364 
in CORR, 152 
in REGRESS, 318 
SAS files, 5 
SAV, 5 
SAVE, 152, 328 
in ANOVA, 131 
in BAYESIAN, 135 
in CLUSTER, 144

in CONJOINT, 146 

in CORAN, 148 

in CORR, 152 

in DESIGN, 169 

in DISCRIM, 178 

in FACTOR, 183 

in FITDIST 

in GLM, 196, 207 

in LOGIT, 214 

in LOGLIN, 220 

in MANOVA, 226 

in MDS, 246 

in MISSING, 249 

in MIX, 257 

in MIXED, 262 

in NONLIN, 270 

in NPAR, 277 

in PLS, 281 

in POSAC, 283 

in POWER, 290 

in PROBIT, 293 

in QC, 294 

in RDISCRIM, 314 

in REGRESS, 317 

in RIDGE, 323 

in ROBREG, 328 

in RSM, 331, 333 

in SERIES, 336 

in SIGNAL, 348 

in SMOOTH, 351 

in SPATIAL, 355 

in SURVIVAL, 371 

in TESTAT, 375 

in TSLS, 390 

in UNIVARIATE, 311 

in VC, 395 

in XTAB, 404 
saving 

graphs, 45 

output, 45 
SCALE, 114, 121 
scatterplot matrices, 70 

SPLOM command, 109 
scatterplots, 69 

PLOT command, 92 
SD2, 5
SELECT, 38 
select cases 
SELECT command, 38 
SERIES, 335 
ACF, 337 
ADJSEASON, 341 
ARIMA, 342 
CCF, 337 
CLEAR, 335 
DIFFERENCE, 338 
EXPONENTIAL, 340 
FOURIER, 344 
INDEX, 338 
LOG, 338 
MEAN, 338 
MISSING, 336 
MKTEST, 343 
PACF, 337 
PCNTCHANGE, 339 
SAVE, 336 
SMOOTH, 340 
SQUARE, 339 
STEST, 343 
TAPER, 339 
TIME, 336 
TPLOT, 336 
TREND, 339 
SERROR, 124 
SET 
in LOGIT, 212 
set correlations 
CATEGORY, 346
ERROR, 346 
ESTIMATE, 347 
MODEL, 345 
PLENGTH, 346 
SHP, 5
SIGNAL, 348
ESTIMATE, 349 
MODEL, 348 
SAVE, 348 
signal detection analysis 
SIGNAL command, 348 
SIMULATE 
in SPATIAL, 358
SIN, 53 
SLE, 54
SMOOTH, 350
ESTIMATE, 351 
in SERIES, 340 
MODEL, 350 
SAVE, 351 
smoothing 
SMOOTH command, 350 
SNEATH
in CORR, 165 
SORT, 28 
sort cases 
SORT command, 28 
SPATIAL, 353 
GRID, 355 
KRIG, 357 
MODEL, 353 
POINT, 358 
SAVE, 355 
SIMULATE, 358 
TREND, 354 
VARIOGRAM, 356 
Spatial statistics 
SPATIAL command, 353 
SPEARMAN, 158 
in CORR, 158 
SPLOM, 109
SPLOMs, 70 
SPSS files, 5 
SQL 

IMPORT command, 5 
SQR, 53 
SQUARE 
SSAVE
STACK, 32
in GLM, 198
STEP
in REGRESS, 319
STEST
STOP
in REGRESS, 320
SUM, 54
SURFACE, 126 
in RSM, 334 
SURVIVAL, 366 
ACT, 373 
ESTIMATE, 369 
FUNPAR, 368 
HAZARD, 373 
LTAB, 371 
MODEL, 367 
NAHAZARD, 372 
PLENGTH, 368 
QNTL, 372 
RELIABILITY, 372 
SAVE, 371 
START, 369 
STEP, 370 
STOP, 370 
SURVIVAL command, 366
SYMBOL, 120 
tables

turning tables off, 41 
TABULATE 

in LOGLIN, 221 

in XTAB, 405 
TAGUCHI 

in DESIGN, 171 
TAN, 53 
TAPER 

in SERIES, 339 
TAUB, 159 
TAUC 

in CORR, 160 
TCORR 

in TESTING, 383, 384 
in MIXED, 265
specific correlation, 384
one-sample t-test, 380
test item analysis
TESTAT command, 374 
TESTAT, 374 
ESTIMATE, 376 
KEY, 376 
MODEL, 374 
PLENGTH, 375 
SAVE, 375 
TESTING, 381 
POISSON, 381 
PROP, 386 
TCORR, 383, 384 
TTEST, 380 
VARI, 382, 383 
THICK, 116
TIME
time functions, 57
SERIES command, 335
TITLE, 119 
TNH, 53 
TPLOT
in SERIES, 336 
TPT, 5 
transformations, 123 
in matrix, 241, 242 
TRANSPOSE, 32, 121 
TREES, 387 
ESTIMATE, 388 
MODEL, 387 
PLENGTH, 388 
Trellis displays, 70, 123 
TREND 
in SERIES, 339 
in SPATIAL, 354 
TRIM, 30 
TSLS, 389 
ESTIMATE, 390 
MODEL, 389 
SAVE, 390 
TSQ, 303 
Tsquare charts 
TSQ command, 303 
TTEST, 377, 381 
in TESTING, 380, 381 
two-stage least squares 
TSLS command, 389 
TXT, 5 
TYPE, 5 
in GLM, 207 
UNCE
in CORR, 162 
UNIVARIATE 
in RANDSAMP, 310 
UNWRAP, 31
UPR$, 58
USE, 2 

in matrix, 234 
VAL, 57, 58
VARI 
in TESTING, 382, 383 
VC command, 392
VARIOGRAM 
in SPATIAL, 356 
VARLABEL, 21 
CATEGORY, 393
DMATRIX, 398 
ESTIMATE, 396 
FMATRIX, 398 
HYPOTHESIS, 397 
MODEL, 392 
PAIRWISE, 397 
PLENGTH, 393 
RANDOM, 393 
RESET, 392 
SAVE, 395 
TEST, 399 
VDISPLAY, 42 
WEIGHT, 20
WFORMAT, 126 
WHILE...ENDWHILE, 49 
WILCOXON
WKS, 5
WLOG, 123 
WMAX, 121 
WMF, 45 
WMIN, 121 
WPIP, 121 
WPOW, 123 
WRAP, 31 
WREV, 123 
WRITE, 112 


write text 
WRITE command, 112 
XGRID, 124
XPIP, 121
YTICK, 121